
How can I use sqlContext (to execute SQL queries) in the Python transform?

I have written the following transform in Code Repositories:

@transform_df(
    Output(test_dataset_path),
    df=Input(og_dataset_path)
)
def compute(ctx, df):
    ctx.spark_session.sql(f'''
    CREATE TABLE `test_dataset_path` AS
    SELECT * FROM `og_dataset_path`
    ''')

    return ctx.spark_session.sql(f'''
    SELECT * FROM `og_dataset_path`
    ''')

It fails on this statement:

ctx.spark_session.sql(f'''
CREATE TABLE `test_dataset_path` AS
SELECT * FROM `og_dataset_path`
''')

with the error:

pyspark.sql.utils.AnalysisException: Table or view not found: og_dataset_path

How can I resolve this error?

Chloe Lathe asked Dec 04 '25 09:12


1 Answer

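Spark SQL does not know the input dataset by its path or by the Python variable name, so the table lookup fails. Registering the input DataFrame as a temporary view with createOrReplaceTempView, and then querying that view by name, should resolve the problem: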

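from transforms.api import transform_df, Input, Output

@transform_df(
    Output("/Users/XXXXX/sqlcsvA2"),
    ALL=Input("/datasources/locations/data/cleaned")
)
def my_compute_function(ctx, ALL):
    # Register the input DataFrame under a name that Spark SQL can resolve.
    ALL.createOrReplaceTempView('ALL')
    # Query the view by that name; the returned DataFrame is written to the Output.
    return ctx.spark_session.sql('select * from ALL limit 10')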
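
If a transform has several inputs, the same pattern can be wrapped in a small helper. The following is only a sketch: the name run_sql_on_inputs is hypothetical (it is not part of transforms.api or PySpark), and it assumes that `spark` behaves like a SparkSession and each input like a DataFrame.

```python
def run_sql_on_inputs(spark, query, **inputs):
    """Register each input as a temp view, then run `query` against them.

    Hypothetical helper, not part of any library. `spark` only needs a
    `.sql(query)` method, and each value in `inputs` only needs a
    `.createOrReplaceTempView(name)` method, as SparkSession and
    DataFrame provide.
    """
    for view_name, df in inputs.items():
        # The keyword name becomes the table name visible to Spark SQL.
        df.createOrReplaceTempView(view_name)
    # Return whatever the session returns (a DataFrame in real Spark).
    return spark.sql(query)
```

Inside the question's transform, the call would presumably look like `return run_sql_on_inputs(ctx.spark_session, 'SELECT * FROM og', og=df)`, where `og` is an illustrative view name. No CREATE TABLE statement should be needed, since the DataFrame returned from the function is what gets written to the Output.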

Chloe Lathe answered Dec 12 '25 15:12


