You can use the native Spark function abs():
from pyspark.sql.functions import abs
df1 = df.withColumn('Value', abs(df.Value))
df1.show()
+---+-----+
| Id|Value|
+---+-----+
| a| 5.0|
| b| 1.0|
| c| 0.3|
+---+-----+
Answer from mtoto on Stack Overflow
That's not how you use pyspark.sql.functions. They are not designed to be evaluated outside a DataFrame context; they operate on Columns.
You could use a literal Column:
from pyspark.sql.functions import abs, lit
abs(lit(numb))
but it'll give you yet another Column:
Column<b'abs(-2)'>
While in theory such objects can be evaluated locally, that is not intended for public use.
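If you really need the numeric value on the driver, the supported route is to evaluate the expression through a DataFrame. A minimal sketch, assuming an active spark session:
from pyspark.sql.functions import abs, lit
# select the literal expression from a one-row DataFrame and collect it
spark.range(1).select(abs(lit(-2)).alias("v")).first()["v"]
## 2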
If you want to operate on plain Python numerics, just stick to Python's built-in abs.
If you've shadowed the built-in functions, you can express the function from the comments as:
import builtins

def math_result(current_val, value):
    # percentage difference, rounded to two decimals; call the
    # built-ins explicitly in case abs/round have been shadowed
    result = ((value - current_val) / value) * 100
    return builtins.abs(builtins.round(result, 2))

math_result(1, 3)
## 66.67
Python has a built-in abs function.
pyspark also provides an abs function, but that one operates on DataFrame columns.
If you import pyspark's abs in the pyspark shell, you shadow the built-in abs.
It looks like you have shadowed the built-in abs with something like below:
>>> print(abs(-3))
3
>>> from pyspark.sql.functions import abs
>>> print(abs(-3))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/rs301t/spark-2.3.2-bin-hadoop2.7/python/pyspark/sql/functions.py", line 42, in _
jc = getattr(sc._jvm.functions, name)(col._jc if isinstance(col, Column) else col)
File "/Users/rs301t/spark-2.3.2-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
File "/Users/rs301t/spark-2.3.2-bin-hadoop2.7/python/pyspark/sql/utils.py", line 63, in deco
return f(*a, **kw)
File "/Users/rs301t/spark-2.3.2-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 332, in get_return_value
py4j.protocol.Py4JError: An error occurred while calling z:org.apache.spark.sql.functions.abs. Trace:
py4j.Py4JException: Method abs([class java.lang.Integer]) does not exist
at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:318)
at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:339)
at py4j.Gateway.invoke(Gateway.java:276)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
We should avoid importing function names directly and instead alias the module, so we always call the method we actually intended.
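For example, a minimal sketch of the aliased import, reusing the df from the first answer:
from pyspark.sql import functions as F
# F.abs is the Column version; the Python built-in abs stays untouched
df1 = df.withColumn('Value', F.abs(df.Value))
print(abs(-3))
## 3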
There's an abs function available in Spark SQL
You can either use selectExpr instead of select:
PSPDF.selectExpr("accountname", "abs(amount) as amount", "currency", "datestamp", "orderid", "transactiontype")
or use select's overloaded version that takes Column arguments (this needs import org.apache.spark.sql.functions.abs, and import spark.implicits._ for the $ syntax):
PSPDF.select($"accountname", abs($"amount").as("amount"), $"currency", $"datestamp", $"orderid", $"transactiontype")
Alternatively, you can use the built-in when and negate functions:
import org.apache.spark.sql.functions._
val FilteredPSPDF = PSPDF.select(col("accountname"), when(col("amount") < 0, negate(col("amount"))).otherwise(col("amount")).as("amount"), col("currency"), col("datestamp"), col("orderid"), col("transactiontype"))
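For reference, negate is a Scala API helper; in PySpark the usual equivalent is a unary minus on the Column. A sketch of the same conditional sign flip, assuming the same PSPDF schema:
from pyspark.sql import functions as F
FilteredPSPDF = PSPDF.select(
    F.col("accountname"),
    # flip the sign only when amount is negative
    F.when(F.col("amount") < 0, -F.col("amount")).otherwise(F.col("amount")).alias("amount"),
    F.col("currency"),
    F.col("datestamp"),
    F.col("orderid"),
    F.col("transactiontype"),
)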