Here is my schema:
root
|-- DataPartition: string (nullable = true)
|-- TimeStamp: string (nullable = true)
|-- PeriodId: long (nullable = true)
|-- FinancialAsReportedLineItemName: struct (nullable = true)
| |-- _VALUE: string (nullable = true)
| |-- _languageId: long (nullable = true)
|-- FinancialLineItemSource: long (nullable = true)
|-- FinancialStatementLineItemSequence: long (nullable = true)
|-- FinancialStatementLineItemValue: double (nullable = true)
|-- FiscalYear: long (nullable = true)
|-- IsAnnual: boolean (nullable = true)
|-- IsAsReportedCurrencySetManually: boolean (nullable = true)
|-- IsCombinedItem: boolean (nullable = true)
|-- IsDerived: boolean (nullable = true)
|-- IsExcludedFromStandardization: boolean (nullable = true)
|-- IsFinal: boolean (nullable = true)
|-- IsTotal: boolean (nullable = true)
|-- ParentLineItemId: long (nullable = true)
|-- PeriodPermId: struct (nullable = true)
| |-- _VALUE: long (nullable = true)
| |-- _objectTypeId: long (nullable = true)
|-- ReportedCurrencyId: long (nullable = true)
From the above schema, I am trying to flatten the struct columns like this:
val temp = tempNew1
.withColumn("FinancialAsReportedLineItemName", $"FinancialAsReportedLineItemName._VALUE")
.withColumn("FinancialAsReportedLineItemName_languageId", $"FinancialAsReportedLineItemName._languageId")
.withColumn("PeriodPermId", $"PeriodPermId._VALUE")
.withColumn("PeriodPermId_objectTypeId", $"PeriodPermId._objectTypeId")
.drop($"AsReportedItem")
I don't know what I am missing here. I get the error below:
Exception in thread "main" org.apache.spark.sql.AnalysisException: Can't extract value from FinancialAsReportedLineItemName#2262: need struct type but got string;
The issue is that you are trying to access FinancialAsReportedLineItemName._languageId after the FinancialAsReportedLineItemName column has already been replaced by FinancialAsReportedLineItemName._VALUE. At that point the column is a string, not a struct, so there is no _languageId field left to extract.
You should change the following two lines
.withColumn("FinancialAsReportedLineItemName", $"FinancialAsReportedLineItemName._VALUE")
.withColumn("FinancialAsReportedLineItemName_languageId", $"FinancialAsReportedLineItemName._languageId")
to
.withColumn("FinancialAsReportedLineItemName_value", $"FinancialAsReportedLineItemName._VALUE")
.withColumn("FinancialAsReportedLineItemName_languageId", $"FinancialAsReportedLineItemName._languageId")
If the FinancialAsReportedLineItemName_value column is supposed to be named FinancialAsReportedLineItemName, then swap the two withColumn calls so that the struct field is read before the struct column is overwritten:
.withColumn("FinancialAsReportedLineItemName_languageId", $"FinancialAsReportedLineItemName._languageId")
.withColumn("FinancialAsReportedLineItemName", $"FinancialAsReportedLineItemName._VALUE")
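Note that the question's code has the same ordering problem for PeriodPermId: it overwrites PeriodPermId with PeriodPermId._VALUE and only then reads PeriodPermId._objectTypeId. A minimal sketch of the reordered chain for both structs might look like the following. The object name FlattenStructs, the helper names flatten and sample, and the one-row sample data are all hypothetical and only there to make the example runnable; the column names come from the schema in the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, struct}

object FlattenStructs {

  // For each struct: extract the extra field into a new column first,
  // then overwrite the struct column with its _VALUE field last.
  def flatten(df: DataFrame): DataFrame =
    df
      .withColumn("FinancialAsReportedLineItemName_languageId",
        col("FinancialAsReportedLineItemName._languageId"))
      .withColumn("FinancialAsReportedLineItemName",
        col("FinancialAsReportedLineItemName._VALUE"))
      .withColumn("PeriodPermId_objectTypeId", col("PeriodPermId._objectTypeId"))
      .withColumn("PeriodPermId", col("PeriodPermId._VALUE"))

  // Hypothetical one-row DataFrame with the two struct columns from the schema.
  def sample(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(("Revenue", 505074L, 1001L, 402L))
      .toDF("name", "lang", "perm", "objType")
      .select(
        struct(col("name").as("_VALUE"), col("lang").as("_languageId"))
          .as("FinancialAsReportedLineItemName"),
        struct(col("perm").as("_VALUE"), col("objType").as("_objectTypeId"))
          .as("PeriodPermId"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]") // assumption: local run, for illustration only
      .appName("flatten-structs")
      .getOrCreate()
    flatten(sample(spark)).printSchema()
    spark.stop()
  }
}
```

With this ordering, FinancialAsReportedLineItemName and PeriodPermId end up as plain string and long columns, and the two extra fields survive as their own columns.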