# Sum of two or more columns in pyspark

To calculate the sum of two or more columns in pyspark, we use the + operator on the columns. The first method applies + inside the select() function, which returns only the summed column. The second method applies + inside withColumn(), which appends the sum to the dataframe as a new column. Let’s see an example of each.

• Sum of two or more columns in pyspark using + and select()
• Sum of multiple columns in pyspark and appending to dataframe

We will be using the dataframe df_student_detail.

#### Sum of two or more columns in pyspark: Method 1

• In Method 1 we use the + operator to calculate the sum of multiple columns, together with the select() function, so the result contains only the summed column.
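
```python
# Sum of two or more columns in pyspark: select() with the + operator

from pyspark.sql.functions import col

# Add the two score columns and keep only the result, named "sum"
df1 = df_student_detail.select(
    (col("mathematics_score") + col("science_score")).alias("sum")
)
df1.show()
```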

This method adds the two columns and returns a dataframe that contains only the resulting column, sum.

#### Sum of multiple columns in pyspark and appending to dataframe: Method 2

In Method 2 we again use the + operator to calculate the sum of two or more columns, and append the result to the dataframe with withColumn(), naming the new column sum.

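```python
# Sum of two or more columns in pyspark: withColumn() with the + operator

from pyspark.sql.functions import col

# Append the sum of the two score columns as a new column named "sum"
df1 = df_student_detail.withColumn(
    "sum", col("mathematics_score") + col("science_score")
)
df1.show()
```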

Method 2 therefore adds the two columns “mathematics_score” and “science_score” and stores the result in a new column named “sum”, which is appended to the existing columns of the dataframe.

## Author

• With close to 10 years of experience in data science and machine learning, the author has worked extensively with programming languages such as R, Python (Pandas), SAS, and Pyspark.