Find & extract data from spreadsheet

Extracting data from Office files—such as spreadsheets—can sometimes be challenging, especially because some file formats (e.g., XLSX) are not easily manipulated directly with command-line tools.

In this example, a spreadsheet has been converted to a CSV file that consists of three columns: Material ID, Production, and Uploaded.

Example spreadheet
Material ID;Production;Uploaded;
PR2SCU4K;17:30 Daily Production (2025.03.24);Yes;
PR2SCU4M;17:30 Daily Production (2025.03.25);Yes;
PR2SCU4N;17:30 Daily Production (2025.03.26);Yes;
PR2SCU4P;17:30 Daily Production (2025.03.27);Yes;
PR2SCU4Q;17:30 Daily Production (2025.03.28);Yes;
PR2SCU4R;17:30 Daily Production (2025.03.31);;
PR2SCU4S;17:30 Daily Production (2025.04.01);;
PR2SCU4T;17:30 Daily Production (2025.04.02);;

The goal is to locate a specific date (for instance, today's date) and extract the corresponding Material ID. A shell script is used to search for the desired date within the CSV file and output the associated Material ID.

The command used is:

Here’s what each part does:

  • grep -E "(${SEARCH_DATE})" "$CSV_FILE": Searches the CSV file for lines that match the date stored in the SEARCH_DATE variable.

  • awk -F';' '{print $1}': Uses awk with a semicolon (;) as the field delimiter to extract the first column (Material ID) from the matched lines.

Note: If your CSV file uses a different delimiter (such as a comma), update the -F';' option in the awk command accordingly.

To extract for example the Upload field, this could be done with {print $3}.

Example:

search_date = 2025.03.25

MID = PR2SCU4M

Last updated