
jackson-dataformat-csv: Mapping number value without POJO

I'm trying to parse a CSV file using jackson-dataformat-csv, and I want the numeric column to be mapped to the Java Number type.

CsvSchema schema = CsvSchema.builder().setUseHeader(true)
    .addColumn("firstName", CsvSchema.ColumnType.STRING)
    .addColumn("lastName", CsvSchema.ColumnType.STRING)
    .addColumn("age", CsvSchema.ColumnType.NUMBER)
    .build();

CsvMapper csvMapper = new CsvMapper();  

MappingIterator<Map<String, Object>> mappingIterator = csvMapper
        .readerFor(Map.class)
        .with(schema)
        .readValues(is);        

while (mappingIterator.hasNext()) {
    Map<String, Object> entryMap = mappingIterator.next();
    Number age = (Number) entryMap.get("age");
}       

I expect entryMap.get("age") to return a Number, but I get a String instead.

My CSV file:

firstName,lastName,age
John,Doe,21
Error,Name,-10

I know that CsvSchema works fine with POJOs, but I need to process arbitrary CSV schemas, so I can't create a new Java class for every case.

Is there any way to parse CSV into a typed Map or array?

Igor Luzhanov asked Mar 11 '19 20:03


1 Answer

Right now it is not possible to configure Map deserialisation through CsvSchema. The process uses com.fasterxml.jackson.databind.deser.std.MapDeserializer, which currently does not check the schema. We could write a custom Map deserialiser. There is a related issue on GitHub, "CsvMapper does not respect CsvSchema.ColumnType when using @JsonAnySetter", where cowtowncoder answered:

At this point schema type is not used much for anything, but I agree it should.
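
Until the schema type is honoured, one workaround that needs no custom deserialiser is to read each row as strings and convert the numeric columns afterwards. This is a different technique from the one shown in the EDIT below. The sketch uses only the JDK: RowConverter, convertRow, and the numericColumns set are hypothetical names, and in a real setup the set would presumably be built from the CsvSchema columns whose type is NUMBER.

```java
import java.math.BigDecimal;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Jackson.
class RowConverter {

    // Returns a copy of the row in which every column listed in
    // numericColumns holds a BigDecimal. Values that cannot be parsed
    // are left as strings.
    static Map<String, Object> convertRow(Map<String, String> row, Set<String> numericColumns) {
        Map<String, Object> typed = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : row.entrySet()) {
            String value = entry.getValue();
            Object converted = value;
            if (value != null && numericColumns.contains(entry.getKey())) {
                try {
                    converted = new BigDecimal(value.trim());
                } catch (NumberFormatException e) {
                    // not a number: keep the original string
                }
            }
            typed.put(entry.getKey(), converted);
        }
        return typed;
    }
}
```

BigDecimal is used here because it extends Number and parses independently of the JVM locale. The cost is an extra pass over every row.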

EDIT

I decided to take a closer look at what we can do with the fact that com.fasterxml.jackson.databind.deser.std.MapDeserializer is used behind the scenes. A custom Map deserialiser that takes care of types would be tricky to implement and register, but we can use what we know about ValueInstantiator instead. Let's define a new Map type that knows what to do with the ColumnType info:

class CsvMap extends HashMap<String, Object> {

    private final CsvSchema schema;
    // Uses the default locale of the JVM
    private final NumberFormat numberFormat = NumberFormat.getInstance();

    public CsvMap(CsvSchema schema) {
        this.schema = schema;
    }

    @Override
    public Object put(String key, Object value) {
        value = convertIfNeeded(key, value);
        return super.put(key, value);
    }

    private Object convertIfNeeded(String key, Object value) {
        if (value == null) {
            return null;
        }
        // column(...) returns null for names that are not in the schema
        CsvSchema.Column column = schema.column(key);
        if (column != null && column.getType() == CsvSchema.ColumnType.NUMBER) {
            try {
                return numberFormat.parse(value.toString());
            } catch (ParseException e) {
                // not a number: leave the value as it is
            }
        }

        return value;
    }
}

Because the new type has no no-arg constructor, we need to create a ValueInstantiator for it:

class CsvMapInstantiator extends ValueInstantiator.Base {

    private final CsvSchema schema;

    public CsvMapInstantiator(CsvSchema schema) {
        super(CsvMap.class);
        this.schema = schema;
    }

    @Override
    public Object createUsingDefault(DeserializationContext ctxt) {
        return new CsvMap(schema);
    }

    @Override
    public boolean canCreateUsingDefault() {
        return true;
    }
}

Example usage:

import com.fasterxml.jackson.databind.DeserializationContext;
import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.databind.ObjectReader;
import com.fasterxml.jackson.databind.deser.ValueInstantiator;
import com.fasterxml.jackson.databind.module.SimpleModule;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;

import java.io.File;
import java.io.IOException;
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.HashMap;

public class CsvApp {

    public static void main(String[] args) throws IOException {
        File csvFile = new File("./resource/test.csv").getAbsoluteFile();

        CsvSchema schema = CsvSchema.builder()
                .addColumn("firstName", CsvSchema.ColumnType.STRING)
                .addColumn("lastName", CsvSchema.ColumnType.STRING)
                .addColumn("age", CsvSchema.ColumnType.NUMBER)
                .build().withHeader();

        // Create schema aware map module
        SimpleModule csvMapModule = new SimpleModule();
        csvMapModule.addValueInstantiator(CsvMap.class, new CsvMapInstantiator(schema));

        // register the module with the mapper
        CsvMapper csvMapper = new CsvMapper();
        csvMapper.registerModule(csvMapModule);

        // get reader for CsvMap + schema
        ObjectReader objectReaderWithSchema = csvMapper
                .readerWithSchemaFor(CsvMap.class)
                .with(schema);

        MappingIterator<CsvMap> mappingIterator = objectReaderWithSchema.readValues(csvFile);

        while (mappingIterator.hasNext()) {
            CsvMap entryMap = mappingIterator.next();

            Number age = (Number) entryMap.get("age");
            System.out.println(age + " (" + age.getClass() + ")");
        }
    }
}

The code above, run against this CSV payload:

firstName,lastName,age
John,Doe,21
Error,Name,-10.1

prints:

21 (class java.lang.Long)
-10.1 (class java.lang.Double)
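
The Long/Double split in that output comes from NumberFormat.parse, which returns a Long when the text is a whole number that fits and a Double otherwise. Two caveats are worth knowing: NumberFormat.getInstance() depends on the default locale of the JVM, and parse(String) accepts a numeric prefix, so a value such as 21abc would silently become 21. Below is a minimal JDK-only sketch of a stricter variant. StrictNumbers and parseOrNull are hypothetical names, and Locale.ROOT is an assumed choice, not something the answer prescribes.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

// Hypothetical helper illustrating strict, locale-independent parsing.
class StrictNumbers {

    // Returns the parsed Number, or null unless the whole text is a number.
    static Number parseOrNull(String text) {
        if (text == null || text.isEmpty()) {
            return null;
        }
        // A new instance per call, because NumberFormat is not thread-safe.
        NumberFormat format = NumberFormat.getInstance(Locale.ROOT);
        ParsePosition position = new ParsePosition(0);
        Number number = format.parse(text, position);
        // Reject input where parsing stopped before the end of the text.
        if (number == null || position.getIndex() != text.length()) {
            return null;
        }
        return number;
    }
}
```

A check like this could replace the numberFormat.parse call inside convertIfNeeded, with a null result meaning "keep the original string".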

It looks like a hack, but I wanted to show that it is possible.

Michał Ziober answered Oct 04 '22 09:10