I have the DataFrame below with only one column, holding these values:
abc,1,2,345,765,876,Kumar r,Raghvan ,04041996
abc,1,2,345,765,876,"sam Bailey,20541789 # here a double quote is already present after the 6th comma
abc,1011,2,32,678,,,,,
I am looking for a regular expression in PySpark that adds double quotes around the text after the 6th comma, ending just before the trailing digits.
The expected output for the values above is:
abc,1,2,345,765,876,"Kumar r,Raghvan" ,04041996
abc,1,2,345,765,876,"sam Bailey",20541789
abc,1011,2,32,678,,,,,
I have tried the code below but did not get the expected output.
What I have tried:
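from pyspark.sql.functions import col, regexp_replace
# The pattern is single-quoted so that the double quotes inside it do not end the string.
df_with_quotes = df.withColumn("data_with_quotes", regexp_replace(col("data"), r'((?:[^,],){6})([^"].[^"$])(,[^,]+$)', r'\1"\2"\3'))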
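For reference, here is a minimal sketch of a pattern that seems to give the expected output on the three sample rows. It is checked with Python's re module only, not with Spark itself. The pattern and the add_quotes helper are my own assumptions, not something from an existing API. Spark's regexp_replace uses Java regex, where group references in the replacement are written as $1 rather than \1, so the replacement there would presumably be '$1"$2"$3'.

```python
import re

# Hypothetical pattern (an assumption, not a confirmed solution):
#   group 1: the first six comma-terminated fields
#   group 2: the text to quote, skipping an opening quote if one is already there
#   group 3: optional spaces, the last comma, and the trailing digits
PATTERN = re.compile(r'^((?:[^,]*,){6})"?([^",][^"]*?)"?(\s*,\d+)$')


def add_quotes(line: str) -> str:
    # Rows that do not match (e.g. only empty fields after the 6th comma)
    # are returned unchanged.
    return PATTERN.sub(r'\1"\2"\3', line)


samples = [
    'abc,1,2,345,765,876,Kumar r,Raghvan ,04041996',
    'abc,1,2,345,765,876,"sam Bailey,20541789',
    'abc,1011,2,32,678,,,,,',
]

for s in samples:
    print(add_quotes(s))
```

The main differences from my attempt are [^,]* instead of [^,] (so fields longer than one character are matched) and the optional "? on both sides of group 2, which stops a quote that is already present from being doubled.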