How to create fiscal calendar in Incorta?(Part 4)

 This part is about the holiday. I have two columns, holiday name, and day type. Use Spark SQL to complete.

df_ago.registerTempTable("Range_TBL2")
us_holidays = holidays.UnitedStates(years=range(2016,2024))
schema = StructType([
StructField('us_holidays', StringType(), True)
])
df_hoildays = sc.parallelize([(k,)+(v,) for k,v in us_holidays.items()]).toDF(['date','holliday_name'])
df_hoildays=df_hoildays.withColumnRenamed("date","date2")
df_hoildays.registerTempTable("Holiday_TBL")
#df=date_range.join(df_hoildays, "date_range.Date1 = df_hoildays.Date", 'inner')
df = spark.sql(
'''
select Range_TBL2.* ,
Holiday_TBL.holliday_name,
case when holliday_name is not null then 'HOLIDAY'
when Day_Name = 'Saturday' then 'WEEKEND'
when Day_Name = 'Sunday' then 'WEEKEND'
when Day_Name = 'Monday' then 'WEEKDAY'
when Day_Name = 'Tuesday' then 'WEEKDAY'
when Day_Name = 'Wednesday' then 'WEEKDAY'
when Day_Name = 'Thursday' then 'WEEKDAY'
when Day_Name = 'Friday' then 'WEEKDAY'
else null
end
day_type
from
Range_TBL2
left join
Holiday_TBL
on Range_TBL2.Date1 = Holiday_TBL.date2
'''
)
df = df.withColumnRenamed('Date1', 'Date')
save(df)

Below is a screenshot of the dashboard in Incorta.

For more details, please see my previous blog.

https://suziepyspark.blogspot.com/2020/10/using-pyspark-to-get-holidays-in-incorta.html

Comments

Popular posts from this blog

How to create histogram in Incorta use bin function.

Using Time Series Analysis Electric Production by ARIMA Model