This tutorial explains how to split the content of a CSV column so that it is indexed as multiple values in a Solr multivalued field.
Release Apache Solr
You have a CSV file that you want to index into Solr. Some columns of the file contain more than one value, but by default Solr indexes each such column as a single value in the document’s field. Let the following be your CSV data, stored in products.csv:
1,"Harry Potter","book;movie;PC game"
You would like “book”, “movie” and “PC game” to be represented as separate values in the Solr index.
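To make the goal concrete, here is a minimal Python sketch of the transformation that Solr is expected to perform on the row. It does not use Solr at all; it only illustrates the intended result. The column names id and name are assumptions made for illustration (the tutorial only names the field “categories”), and the function names are hypothetical.

```python
import csv
import io

# Sample row from products.csv.
CSV_DATA = '1,"Harry Potter","book;movie;PC game"\n'

# Assumed column names; only "categories" is named in the tutorial.
FIELDNAMES = ["id", "name", "categories"]


def to_document(row, split_field="categories", separator=";"):
    """Build a dict resembling a Solr document, splitting one field into a list."""
    doc = dict(zip(FIELDNAMES, row))
    doc[split_field] = doc[split_field].split(separator)
    return doc


def parse_documents(text):
    """Parse CSV text and return one document dict per row."""
    return [to_document(row) for row in csv.reader(io.StringIO(text))]


if __name__ == "__main__":
    print(parse_documents(CSV_DATA))
```

The csv module handles the quoted columns, so the semicolons stay inside the third column until the explicit split.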
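The solution is to declare the field “categories” with multiValued="true" in the schema and to tell the CSV update request handler to split its values on a particular character (“;” in this example). Using post.jar, you can index the CSV file with a command like the following (the URL assumes a Solr instance at the default local address, so adjust host and port to your setup; “;” is URL-encoded as %3B):
java -Durl="http://localhost:8983/solr/update/csv?f.categories.split=true&f.categories.separator=%3B" -jar post.jar products.csv
This tells Solr a) to split the contents of categories (f.categories.split=true) and b) that the individual values are separated by “;” (f.categories.separator).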
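If you build the update URL in a script rather than typing it by hand, the two per-field parameters can be assembled and URL-encoded as in the following sketch. The base URL is an assumption (a local Solr instance exposing the /update/csv handler), and build_update_url is a hypothetical helper, not part of Solr or post.jar.

```python
from urllib.parse import urlencode

# Assumed address of the CSV update handler; adjust to your installation.
SOLR_CSV_URL = "http://localhost:8983/solr/update/csv"


def build_update_url(field, separator, base_url=SOLR_CSV_URL):
    """Return an update URL that asks Solr to split `field` on `separator`."""
    params = {
        f"f.{field}.split": "true",          # enable splitting for this field
        f"f.{field}.separator": separator,   # character that separates the values
    }
    # urlencode percent-encodes reserved characters such as ";".
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_update_url("categories", ";"))
```

The resulting string can be passed to post.jar through its -Durl system property. Encoding the separator avoids problems with characters that have a special meaning in URLs.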
Hint: You could also define the additional parameters for the /update/csv handler in solrconfig.xml so you won’t need to add them to every single update command.