Suppose I am given the following csv file:
and I want to produce the first k = 2 rows of each group, where a group is defined by the value in column 1.
This is not rocket science, but it is also not as straightforward as, say, sorting or deduplicating in bash. In the special case where k = 1, one could use the following sort syntax:
sort -t, -k1,1 -u input.csv > output.csv
However, I had been under the impression that there is no straightforward way to go beyond k = 1. Today I found a way using awk:
awk -F, '!(a[$1]++ > ('$((k-1))'))' input.csv > output.csv
thanks to this article:
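To make the awk idiom concrete, here is a minimal sketch of the same idea on made-up data. The sample rows and the array name seen are assumptions for illustration only (the original input file is not reproduced here), and k is passed in with -v rather than spliced into the program through shell arithmetic, which is a small variation on the one-liner above, not the exact command.

```shell
#!/bin/sh
# Hypothetical sample data; these rows are invented for illustration.
cat > input.csv <<'EOF'
a,1
a,2
a,3
b,4
c,5
c,6
c,7
EOF

k=2  # number of rows to keep per group

# seen[$1]++ evaluates to how many earlier rows shared this first field
# (0 for the first row of a group), then increments the counter.
# The pattern is true, and the row is printed, while that count is below k.
awk -F, -v k="$k" 'seen[$1]++ < k' input.csv > output.csv

cat output.csv
```

With this sample, output.csv should contain a,1 and a,2, then b,4, then c,5 and c,6. Rows keep their input order and the file does not need to be sorted first, since awk counts occurrences per key in an associative array rather than comparing adjacent lines.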