After #349, we support appending DataFiles. However, I found that some checks may be missing: schema evolution or partition evolution may happen in the table after we generate a DataFile but before we append it, which makes the information in the DataFile invalid. For example, the partition value in the DataFile becomes invalid when partition evolution happens, and lower_bound/upper_bound become invalid when schema evolution happens. So we need to detect the case where a DataFile is incompatible with the table.
For partition evolution, there are two ways to detect it:
- Ensure that the partition value schema matches the existing partition spec in terms of type. This is what we do now, but there are cases it can't detect, e.g. a partition spec type <p1: int, p2: int> reordered to <p2: int, p1: int>.
- Ensure that the partition value schema matches the existing partition spec in terms of field name or field ID.
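To illustrate the difference between the two approaches, here is a minimal standalone sketch. The types `PartitionField` and `FieldType` and both functions are hypothetical and written only for this example; they are not the actual iceberg-rust API.

```rust
// Hypothetical, simplified model of one field in a partition struct type.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum FieldType {
    Int,
    Long,
}

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct PartitionField {
    field_id: i32,
    name: String,
    ty: FieldType,
}

/// Type-only check: compares the field types position by position.
/// This cannot tell <p1: int, p2: int> apart from <p2: int, p1: int>.
fn matches_by_type(file: &[PartitionField], table: &[PartitionField]) -> bool {
    file.len() == table.len() && file.iter().zip(table).all(|(f, t)| f.ty == t.ty)
}

/// Field-ID check: each position must carry the same field ID and type.
fn matches_by_field_id(file: &[PartitionField], table: &[PartitionField]) -> bool {
    file.len() == table.len()
        && file
            .iter()
            .zip(table)
            .all(|(f, t)| f.field_id == t.field_id && f.ty == t.ty)
}
```

With a DataFile written for <p1, p2> and a table spec reordered to <p2, p1>, `matches_by_type` accepts the file while `matches_by_field_id` rejects it, assuming the field IDs stay attached to the fields across the reorder.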
For schema evolution:
- It may also lead to partition evolution; the detection method for partition values is the same as described above.
- Check whether lower_bound/upper_bound match the current schema using the field ID.
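A rough sketch of what a field-ID based bound check could look like is shown here. `ColType`, `Bound`, and `check_bounds` are hypothetical names for this example only, and the table schema is modeled as a plain map from field ID to type rather than the real schema type.

```rust
use std::collections::HashMap;

// Hypothetical, simplified column types and bound values.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum ColType {
    Int,
    Long,
    Str,
}

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Bound {
    Int(i32),
    Long(i64),
    Str(String),
}

impl Bound {
    /// The column type this bound value was written with.
    fn col_type(&self) -> ColType {
        match self {
            Bound::Int(_) => ColType::Int,
            Bound::Long(_) => ColType::Long,
            Bound::Str(_) => ColType::Str,
        }
    }
}

/// Checks that every bound is keyed by a field ID that exists in the
/// current table schema and that the bound's type equals the column type.
fn check_bounds(
    bounds: &HashMap<i32, Bound>,
    schema: &HashMap<i32, ColType>,
) -> Result<(), String> {
    for (field_id, bound) in bounds {
        match schema.get(field_id) {
            None => return Err(format!("field id {field_id} is not in the table schema")),
            Some(ty) if *ty != bound.col_type() => {
                return Err(format!(
                    "field id {field_id}: bound is {:?} but column is {:?}",
                    bound.col_type(),
                    ty
                ));
            }
            Some(_) => {}
        }
    }
    Ok(())
}
```

The same function would be applied to both the lower and the upper bounds. This sketch requires exact type equality and rejects unknown field IDs; a real check may need to be more lenient, for example for type promotions or dropped columns.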
Based on the above analysis, we need to make the following fixes:
I'm not sure whether my understanding is correct, so please correct me if something is wrong. cc @Fokko @liurenjie1024 @Xuanwo