Get difference between two dates in days, weeks, years, months and quarters in pyspark

The difference between two dates in days, weeks, months, quarters and years can be calculated in pyspark using the datediff() and months_between() functions. datediff() returns the difference between two dates in days. Dividing that result by 7 gives the difference in weeks, and dividing it by 365.25 gives an approximate difference in years. months_between() returns the difference between two dates in months. Dividing that result by 3 gives the difference in quarters. Let’s see an example of each.

  • Calculate difference between two dates in days in pyspark
  • Calculate difference between two dates in weeks in pyspark
  • Calculate difference between two dates in months in pyspark
  • Calculate difference between two dates in years in pyspark
  • Calculate difference between two dates in quarters in pyspark


We will be using the dataframe named df1
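A minimal sketch of how a dataframe like df1 could be built, assuming two timestamp columns named birthdaytime and current_time as used in the examples that follow. The sample values and the helper name build_df1 are made up for illustration, and an existing SparkSession is assumed to be passed in as spark.

```python
from datetime import datetime

# Hypothetical sample rows: (birthdaytime, current_time). Values are made up.
SAMPLE_ROWS = [
    (datetime(1990, 5, 17, 8, 30, 0), datetime(2020, 1, 1, 12, 0, 0)),
    (datetime(2000, 2, 29, 0, 0, 0), datetime(2020, 1, 1, 12, 0, 0)),
    (datetime(2015, 12, 31, 23, 59, 59), datetime(2020, 1, 1, 12, 0, 0)),
]


def build_df1(spark):
    """Build a stand-in for df1 from an existing SparkSession (hypothetical helper)."""
    # Python datetime values are inferred as Spark timestamp columns.
    return spark.createDataFrame(SAMPLE_ROWS, ["birthdaytime", "current_time"])
```

With a session available, df1 = build_df1(spark) gives a dataframe that the snippets in the following sections can run against.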

Calculate difference between two dates in days in pyspark

To calculate the difference between two dates in days we use the datediff() function. datediff() takes two arguments, the end date followed by the start date, and returns the number of days from the start date to the end date.


### Calculate difference between two dates in days in pyspark

from pyspark.sql.functions import datediff,col

df1.withColumn("diff_in_days", datediff(col("current_time"),col("birthdaytime"))).show(truncate=False)

The resultant dataframe has a new column, diff_in_days, holding the difference between the two dates in days.
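The argument order matters: the end date comes first, so swapping the arguments flips the sign. The same arithmetic can be cross-checked in plain Python. This is only a sketch for date-only values, and days_between is a hypothetical helper name, not part of pyspark.

```python
from datetime import date


def days_between(end, start):
    """Whole days from start to end, same argument order as datediff(end, start)."""
    return (end - start).days


birthday = date(2000, 2, 29)
today = date(2020, 1, 1)

print(days_between(today, birthday))   # positive: end is after start
print(days_between(birthday, today))   # negative: arguments swapped
```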

 

 

 

Calculate difference between two dates in months in pyspark

To calculate the difference between two dates in months we use the months_between() function. months_between() takes two arguments, the later date followed by the earlier date, and returns the number of months between them, which can include a fractional part.

### Calculate difference between two dates in months in pyspark

from pyspark.sql.functions import months_between,col

df1.withColumn("diff_in_months", months_between(col("current_time"),col("birthdaytime"))).show(truncate=False)

The resultant dataframe has a new column, diff_in_months, holding the difference between the two dates in months.
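The fractional part follows a documented rule. If both dates fall on the same day of the month, or both are the last day of their month, the result is a whole number. Otherwise the leftover days are counted against a 31-day month and the result is rounded to 8 digits by default. A plain-Python sketch of that rule, assuming date-only inputs with no time of day (months_between_sketch is a hypothetical name, not a pyspark function):

```python
import calendar
from datetime import date


def months_between_sketch(end, start):
    """Approximate months_between(end, start) for date-only inputs."""
    whole_months = (end.year - start.year) * 12 + (end.month - start.month)
    end_is_month_end = end.day == calendar.monthrange(end.year, end.month)[1]
    start_is_month_end = start.day == calendar.monthrange(start.year, start.month)[1]
    # Same day of month, or both on the last day of their month: whole number.
    if end.day == start.day or (end_is_month_end and start_is_month_end):
        return float(whole_months)
    # Otherwise the leftover days count as a fraction of a 31-day month.
    return round(whole_months + (end.day - start.day) / 31, 8)


print(months_between_sketch(date(2020, 3, 15), date(2020, 1, 15)))  # same day of month
print(months_between_sketch(date(2020, 2, 29), date(2020, 1, 31)))  # both month ends
print(months_between_sketch(date(2020, 1, 17), date(2020, 1, 1)))   # fractional result
```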

 

 

Calculate difference between two dates in weeks in pyspark

To calculate the difference between two dates in weeks we use the datediff() function. datediff() takes two arguments, both dates, and returns the difference between them in days. We divide the result by 7, the number of days in a week, to get the difference between the two dates in weeks as shown below

### Calculate difference between two dates in weeks in pyspark

from pyspark.sql.functions import datediff,col

df1.withColumn("diff_in_weeks", datediff(col("current_time"),col("birthdaytime"))/7).show()

The resultant dataframe has a new column holding the difference between the two dates in weeks.
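As a sanity check on the conversion, a plain-Python sketch: two dates one year apart should come out at roughly 52 weeks. The helper name weeks_between is hypothetical, and integer division is shown as one way to get completed whole weeks.

```python
from datetime import date

DAYS_PER_WEEK = 7


def weeks_between(end, start):
    """Fractional weeks from start to end: the day difference divided by 7."""
    return (end - start).days / DAYS_PER_WEEK


start = date(2020, 1, 1)
end = date(2021, 1, 1)  # 366 days later, since 2020 is a leap year

print(weeks_between(end, start))             # fractional weeks, a little over 52
print((end - start).days // DAYS_PER_WEEK)   # completed whole weeks
```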

 

Calculate difference between two dates in quarters in pyspark

To calculate the difference between two dates in quarters we use the months_between() function. months_between() takes two arguments, both dates, and returns the difference between them in months. We divide the result by 3, the number of months in a quarter, to get the difference between the two dates in quarters as shown below

### Calculate difference between two dates in quarters in pyspark

from pyspark.sql.functions import months_between,col

df1.withColumn("diff_in_quarters", months_between(col("current_time"),col("birthdaytime"))/3).show(truncate=False)

The resultant dataframe has a new column holding the difference between the two dates in quarters.

 

 

 

Calculate difference between two dates in years in pyspark

To calculate the difference between two dates in years we use the datediff() function. datediff() takes two arguments, both dates, and returns the difference between them in days. We divide the result by 365.25, the average length of a year in days once leap years are counted, to get an approximate difference between the two dates in years as shown below

### Calculate difference between two dates in years in pyspark

from pyspark.sql.functions import datediff,col

df1.withColumn("diff_in_years", datediff(col("current_time"),col("birthdaytime"))/365.25).show()

The resultant dataframe has a new column, diff_in_years, holding the difference between the two dates in years.
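Because 365.25 is an average, the result is an approximation and will not always be a whole number on exact anniversaries. A plain-Python sketch of the same arithmetic illustrates this (years_between is a hypothetical helper name):

```python
from datetime import date

AVG_DAYS_PER_YEAR = 365.25  # averages one leap day every four years


def years_between(end, start):
    """Approximate years from start to end: day difference / 365.25."""
    return (end - start).days / AVG_DAYS_PER_YEAR


# Exactly one calendar year apart, but no leap day in between (365 days).
print(years_between(date(2019, 1, 1), date(2018, 1, 1)))
# Four calendar years including one leap day (1461 days) divide evenly.
print(years_between(date(2020, 1, 1), date(2016, 1, 1)))
```

When exact anniversaries matter, dividing months_between() by 12 is one possible alternative, since it returns a whole number of months when both dates fall on the same day of the month.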

The difference between two timestamps is covered in the next chapter.



Author

  • Sridhar Venkatachalam

    With close to 10 years of experience in data science and machine learning, has worked extensively with programming languages such as R, Python (Pandas), SAS and Pyspark.