DataFrames in Apache Spark are immutable, so you cannot change one in place. To delete rows from a DataFrame, filter out the rows you do not want and save the result as a new DataFrame.
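The "delete by filtering" pattern the answer describes, sketched in plain Python so it runs anywhere (PySpark's `df.filter(...)` / `df.where(...)` apply the same keep-what-you-want idea, lazily and distributed; the column names here are made up for illustration):

```python
# The pattern: build a new collection containing only the rows you
# want to keep; the original data is left untouched.
rows = [
    {"id": 1, "day": 3},
    {"id": 2, "day": 5},
    {"id": 3, "day": 3},
]

# Keep rows where day != 3 -- analogous to df.filter("day != 3") in PySpark.
kept = [r for r in rows if r["day"] != 3]

print(kept)       # [{'id': 2, 'day': 5}]
print(len(rows))  # 3 -- the source rows are unchanged
```

The point is that nothing is deleted in place: "removal" is just selecting the complement.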

Answer from koiralo on Stack Overflow
🌐 Srinimf
srinimf.com › 2024 › 03 › 25 › 5-best-ways-to-delete-rows-in-pyspark
5 Best Ways to Delete Rows in PySpark – Srinimf
March 25, 2024 - In PySpark, you can delete rows from a DataFrame with filter, where, na.drop, drop, or a SQL expression based on criteria.
🌐 Py-spark-sql
py-spark-sql.com › home › tutorials › beginner › delete with where
SQL DELETE with WHERE — Targeted Row Removal Tutorial
5 days ago - DELETE FROM t WHERE NOT EXISTS (SELECT 1 FROM other WHERE other.fk = t.pk) removes rows from t that have no matching record in other. For each row in t, the correlated subquery runs — if it returns zero rows, NOT EXISTS is TRUE and the row ...
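The correlated NOT EXISTS delete described in that snippet can be tried end-to-end in any SQL engine. A minimal sketch using Python's built-in sqlite3 (the table and column names `t`, `pk`, `other`, `fk` follow the snippet; the data is made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (pk INTEGER PRIMARY KEY);
    CREATE TABLE other (fk INTEGER);
    INSERT INTO t (pk) VALUES (1), (2), (3);
    INSERT INTO other (fk) VALUES (1), (3);
""")

# Delete rows in t that have no matching row in other:
# for pk = 2 the correlated subquery returns zero rows,
# so NOT EXISTS is true and that row is removed.
conn.execute("""
    DELETE FROM t
    WHERE NOT EXISTS (SELECT 1 FROM other WHERE other.fk = t.pk)
""")

remaining = [row[0] for row in conn.execute("SELECT pk FROM t ORDER BY pk")]
print(remaining)  # [1, 3]
```

Note this is plain SQL DELETE semantics; in Spark, DELETE FROM works like this only on tables that support it (e.g. Delta tables), not on plain DataFrames.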
Discussions

pyspark - How to delete rows efficiently in sparksql? - Stack Overflow
I get a view with IDs for which I have to delete the corresponding records in a table present in a database. View:
| columnName |
| ---------- |
| 1 |
| 2 |
| 3 |
| 4 |
More on stackoverflow.com
🌐 stackoverflow.com
sql server - Delete records from table before writing dataframe - pyspark - Stack Overflow
I'm trying to delete records from my table before writing data into it from a dataframe. It's not working for me ... What am I doing wrong? Goal: "delete from xx_files_tbl" before writing new More on stackoverflow.com
🌐 stackoverflow.com
apache spark - Remove rows from dataframe based on condition in pyspark - Stack Overflow
I have one dataframe with two columns:
+--------+-----+
| col1   | col2|
+--------+-----+
| 22     | 12.2|
| 1      |  2.1|
| 5      | 52.1|
| 2      | 62.9|
| 77     | 33.3|
+--------+-----+
I would like to creat... More on stackoverflow.com
🌐 stackoverflow.com
pyspark - Delete rows based on condition from SQL Server using Databricks Notebook - Stack Overflow
In a Databricks notebook I'm calling my custom function that creates a session for connecting to a SQL Server database: spark = jdbc_spark_session( database=database, username=username, More on stackoverflow.com
🌐 stackoverflow.com
People also ask

What is py-spark-sql.com?
py-spark-sql.com is a free interactive learning platform for Python, PySpark and SQL. It offers 300+ practice exercises with instant code execution directly in your browser — no installation needed.
🌐 py-spark-sql.com
Is py-spark-sql.com free?
Yes — all tutorials, practice exercises and the interactive code editor are completely free to use.
🌐 py-spark-sql.com
Do I need to install SQL to use this platform?
No. The SQL engine and databases run entirely in your browser. You can start practising immediately without any setup or installation.
🌐 py-spark-sql.com
🌐 Stack Overflow
stackoverflow.com › questions › 74107351 › how-to-delete-rows-efficiently-in-sparksql
pyspark - How to delete rows efficiently in sparksql? - Stack Overflow
spark.sql("""DELETE FROM """+database+"""."""+tableName+""" WHERE """+columnName+""" IN ( """+st+""") """) Here, st has all the unique values, for eg. if columnName is empid, then st with have 1,2,3,etc which I extract from the view using python. ... Both of the approaches work for me. Logic 1 is way faster than the merge logic but it still takes 20+ minutes because I have a lot of rows in my tables.
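The snippet above builds its DELETE by string concatenation, which is fragile (and injection-prone when values come from user input). When the target is a JDBC database, the same IN-list delete can be issued with bound placeholders instead. A minimal sketch using Python's built-in sqlite3 to show the pattern (the `employees`/`empid` names are illustrative, echoing the snippet's empid example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (empid INTEGER, name TEXT);
    INSERT INTO employees VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
""")

# IDs extracted from the view; one "?" placeholder per value, so the
# values are bound by the driver rather than spliced into the SQL text.
ids_to_delete = [1, 2, 3]
placeholders = ", ".join("?" for _ in ids_to_delete)
conn.execute(
    f"DELETE FROM employees WHERE empid IN ({placeholders})",
    ids_to_delete,
)

remaining = [row[0] for row in conn.execute("SELECT empid FROM employees")]
print(remaining)  # [4]
```

Spark SQL itself does not bind parameters this way in older versions, which is why the Stack Overflow answer resorts to concatenation; the placeholder pattern applies when you talk to the database through a DB-API/JDBC connection.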
🌐 Databricks
docs.databricks.com › sql language reference › delete from
DELETE FROM | Databricks on AWS
March 16, 2026 - [ common_table_expression ] DELETE FROM table_name [table_alias] [WHERE predicate] ... Common table expressions (CTE) are one or more named queries which can be reused multiple times within the main query block to avoid repeated computations or to improve readability of complex, nested queries. ... Identifies an existing table. The name must not include a temporal specification. ... Define an alias for the table. The alias must not include a column list. ... Filter rows by predicate.
🌐 Spark By {Examples}
sparkbyexamples.com › home › apache spark › spark drop, delete, truncate differences
Spark Drop, Delete, Truncate Differences - Spark By {Examples}
March 27, 2024 - Let's discuss the differences between drop, delete, and truncate using Spark SQL. Even though Drop, Delete, and Truncate sound similar, there is a
🌐 Kontext
kontext.tech › home › blogs › spark & pyspark › "delete" rows (data) from pyspark dataframe
"Delete" Rows (Data) from PySpark DataFrame - Kontext Labs
September 25, 2021 - We can use the where or filter function to 'remove' or 'delete' rows from a DataFrame. from pyspark.sql import SparkSession appName = "Python Example - 'Delete' Data from DataFrame" master = "local" # Create Spark session spark = SparkSession.builder ...
🌐 Stack Overflow
stackoverflow.com › questions › 78850219 › delete-rows-based-on-condition-from-sql-server-using-databricks-notebook
pyspark - Delete rows based on condition from SQL Server using Databricks Notebook - Stack Overflow
%scala
import java.util.Properties
import java.sql.DriverManager

val jdbcUsername = "admin02"
val jdbcPassword = "Welcome@1"
val driverClass = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
val jdbcUrl = s"jdbc:sqlserver://dilipsqlser.database.windows.net:1433;database=db02;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;"

val connectionProperties = new Properties()
connectionProperties.put("user", jdbcUsername)
connectionProperties.put("password", jdbcPassword)
connectionProperties.setProperty("Driver", driverClass)

val connection = DriverManager.getConnection(jdbcUrl, jdbcUsername, jdbcPassword)
val sql = "DELETE FROM Events WHERE EventId < 10003"
val stmt = connection.createStatement()
stmt.execute(sql)
stmt.close()
connection.close()
println("Rows with EventId < 10003 have been deleted.")
🌐 Medium
medium.com › @shuklaprashant9264 › deltalake-delete-rows-that-match-a-predicate-condition-3dc66329d2cc
DeltaLake Delete rows that match a predicate condition | by PrashantShukla | Medium
May 4, 2023 - Here’s an example of how you can delete rows from a Delta table that match a predicate condition: from delta import DeltaTable from pyspark.sql.functions import col
🌐 Delta Lake
delta.io › blog › 2022-12-07-delete-rows-from-delta-lake-table
How to Delete Rows from a Delta Lake Table | Delta Lake
December 7, 2022 - Delete all the rows where the age is greater than 75. import delta import pyspark.sql.functions as F dt = delta.DeltaTable.forPath(spark, "tmp/sunny-table") dt.delete(F.col("age") > 75)
🌐 GeeksforGeeks
geeksforgeeks.org › python › drop-rows-in-pyspark-dataframe-with-condition
Drop Rows in PySpark DataFrame with Condition - GeeksforGeeks
July 23, 2025 - The dropna() method is used to remove rows that contain any null (missing) values in the DataFrame. It’s handy for quick cleanup when you want to keep only fully complete records ... import pyspark from pyspark.sql import SparkSession spark ...
🌐 Saturn Cloud
saturncloud.io › blog › how-to-remove-rows-in-a-spark-dataframe-based-on-position-a-comprehensive-guide
How to Remove Rows in a Spark Dataframe Based on Position: A Guide | Saturn Cloud Blog
January 11, 2024 - To remove rows based on their position, we’ll need to add an index column to the DataFrame, which will allow us to identify each row’s position. Once we have this, we can filter out the rows we don’t want.
🌐 Stack Overflow
stackoverflow.com › questions › 59931753 › pyspark-delete-row-from-postgresql
Pyspark delete row from PostgreSQL - Stack Overflow
January 27, 2020 - Dataframes in Apache Spark are immutable. You can filter out the rows you don't want. See the documentation. ...
df = spark.read.jdbc("conn-url", "mytable")
df.createOrReplaceTempView("mytable")
df2 = spark.sql("SELECT * FROM mytable WHERE day != 3")
df2.collect()
🌐 Skytowner
skytowner.com › explore › removing_rows_that_contain_specific_substring_in_pyspark_dataframe
Removing rows that contain specific substring in PySpark DataFrame
To remove rows that contain a specific substring (e.g. '#') in a PySpark DataFrame, use the contains(~) method: from pyspark.sql import functions as F