Get the difference between two timestamps in hours, minutes & seconds in PySpark

To get the difference between two timestamps in hours, minutes and seconds in PySpark, we first compute the difference in seconds and then convert it. Dividing the difference in seconds by 60 gives the difference in minutes, and dividing it by 3600 gives the difference in hours. Let's see an example of each.

  • Calculate the difference between two timestamps in hours in PySpark
  • Calculate the difference between two timestamps in minutes in PySpark
  • Calculate the difference between two timestamps in seconds in PySpark
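The arithmetic behind all three results can be checked outside Spark. The following is a minimal sketch in plain Python; the two timestamp values are made up for illustration and are not rows from df1.

```python
from datetime import datetime

# Hypothetical timestamps, chosen only to illustrate the arithmetic
birthdaytime = datetime(2019, 1, 1, 10, 0, 0)
current_time = datetime(2019, 1, 1, 12, 30, 45)

# Difference in whole seconds
diff_secs = int((current_time - birthdaytime).total_seconds())

# Dividing by 60 gives minutes, dividing by 3600 gives hours
diff_mins = diff_secs / 60
diff_hours = diff_secs / 3600

print(diff_secs, diff_mins, diff_hours)
```

The minutes and hours come out as fractional values, which matches what the division does in the PySpark examples that follow.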

We will be using a dataframe named df1, which has two timestamp columns: birthdaytime and current_time.

Calculate the difference between two timestamps in seconds in PySpark

To calculate the difference between two timestamps in seconds, we cast both timestamps to long and subtract one from the other, as shown below.

# Calculate the difference between two timestamps in seconds in PySpark

from pyspark.sql.functions import col

# Casting a timestamp to long gives whole seconds since the Unix epoch
diff_secs_col = col("current_time").cast("long") - col("birthdaytime").cast("long")
df2 = df1.withColumn("diff_secs", diff_secs_col)
df2.show(truncate=False)

The resulting dataframe has an additional diff_secs column holding the difference between the two timestamps in seconds.
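Casting a Spark timestamp to long yields whole seconds since the Unix epoch, so any fractional seconds are dropped before the subtraction. The plain-Python sketch below mimics that behaviour. The helper to_epoch_seconds and the timestamp values are made up for illustration and are not part of PySpark.

```python
import math
from datetime import datetime, timezone


def to_epoch_seconds(ts):
    """Hypothetical stand-in for cast("long"): whole seconds since the Unix epoch."""
    return math.floor(ts.timestamp())


# Made-up timestamps with fractional seconds, in UTC
birthdaytime = datetime(2019, 1, 1, 10, 0, 0, 250000, tzinfo=timezone.utc)
current_time = datetime(2019, 1, 1, 10, 0, 5, 900000, tzinfo=timezone.utc)

# Subtract the truncated values, as the PySpark expression does
diff_secs = to_epoch_seconds(current_time) - to_epoch_seconds(birthdaytime)
print(diff_secs)  # whole seconds only; the exact gap is 5.65 s
```

If sub-second precision matters, the long cast is probably not the right tool, since both operands are truncated before the subtraction.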

Calculate the difference between two timestamps in minutes in PySpark

To calculate the difference between two timestamps in minutes, we cast both timestamps to long and subtract them, which gives the difference in seconds, and then divide the result by 60, as shown below.


# Calculate the difference between two timestamps in minutes in PySpark

from pyspark.sql.functions import col

# Difference in seconds, divided by 60 to express it in minutes
diff_secs_col = col("current_time").cast("long") - col("birthdaytime").cast("long")
df2 = df1.withColumn("diff_mins", diff_secs_col / 60)
df2.show()

The resulting dataframe has an additional diff_mins column holding the difference between the two timestamps in minutes, as a fractional value.

Calculate the difference between two timestamps in hours in PySpark

To calculate the difference between two timestamps in hours, we cast both timestamps to long and subtract them, which gives the difference in seconds, and then divide the result by 3600, as shown below.


# Calculate the difference between two timestamps in hours in PySpark

from pyspark.sql.functions import col

# Difference in seconds, divided by 3600 to express it in hours
diff_secs_col = col("current_time").cast("long") - col("birthdaytime").cast("long")
df2 = df1.withColumn("diff_hours", diff_secs_col / 3600)
df2.show()

The resulting dataframe has an additional diff_hours column holding the difference between the two timestamps in hours, as a fractional value.
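Each result above expresses the whole difference in a single unit. If a combined breakdown into hours, minutes and seconds is wanted instead, the same difference in seconds can be split with integer division. This is a minimal plain-Python sketch of that arithmetic, using a made-up value rather than data from df1.

```python
# Hypothetical difference in seconds (2 h 30 min 45 s)
diff_secs = 9045

# Peel off whole hours first, then whole minutes from the remainder
hours, remainder = divmod(diff_secs, 3600)
minutes, seconds = divmod(remainder, 60)

print(f"{hours} h {minutes} min {seconds} s")
```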

Along similar lines, the previous chapter looked at the difference between two dates using the date_diff() function.



Author

  • Sridhar Venkatachalam

    Close to 10 years of experience in data science and machine learning. Has worked extensively with programming languages like R, Python (Pandas), SAS and PySpark.