Hi,
Yes, this looks like a drawback in the SI creation command. As Akash
pointed out, instead of trying to update the status of all the segments at
once, we can do two things:
1. Load in batches (similar to what Akash mentioned) and, if a batch
fails, just stop loading without failing the SI creation command. The user
can then use the reindex command to repair the remaining segments, or the
repair can be triggered by subsequent loads.
2. Provide a way to load only a user-defined number of segments into the
SI instead of loading all of them at once. For example, say the user wants
to create an SI table with 40000 segments. They can create the table with
only 500 or 1000 segments initially, then fire the reindex command to load
the remaining segments, or let subsequent load commands repair the
remaining segments in batches.
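To make the two options concrete, here is a rough sketch of the control flow. It is written in Python only for illustration and is not CarbonData code: load_in_batches, load_segment, batch_size and max_segments are hypothetical names, and "marking success" is modelled as appending to a list rather than updating a real table status file.

```python
from concurrent.futures import ThreadPoolExecutor


def load_in_batches(segments, load_segment, batch_size=500, max_segments=None):
    """Load segments batch by batch and commit status after each batch.

    Returns (loaded, pending). A failing batch stops the loop instead of
    raising, so the caller (the hypothetical SI creation command) succeeds
    and the pending segments are left for a later reindex or load.
    """
    # Option 2: cap how many segments this invocation attempts at all.
    limit = len(segments) if max_segments is None else min(max_segments, len(segments))
    loaded = []
    with ThreadPoolExecutor(max_workers=4) as pool:
        for start in range(0, limit, batch_size):
            batch = segments[start:min(start + batch_size, limit)]
            futures = [pool.submit(load_segment, seg) for seg in batch]
            try:
                for future in futures:
                    future.result()  # re-raises any exception from the worker
            except Exception:
                # Option 1: stop loading, but do not fail the whole command.
                break
            # Status is committed per batch, not once for all segments.
            loaded.extend(batch)
    # Batches are processed in order, so loaded is always a prefix of segments.
    pending = segments[len(loaded):]
    return loaded, pending
```

With this shape, a failure in one batch costs at most batch_size segments of rework, and everything in pending is what a reindex (or the next load) would have to pick up. Segments in the failed batch are all treated as pending, even those whose individual load happened to finish.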
Others can give their input as well.
Regards
Vikram
On Tue, Mar 2, 2021 at 4:00 PM akashrn5 <[hidden email]> wrote:
> Hi,
>
> Yes, as you mentioned, this is a major drawback in the current SI flow. The
> problem exists because, once we get the set of segments to load, we start
> an executor service and hand it the whole segment list; only after .get
> returns do we mark the status of all segments as success at once.
>
> So we need to rewrite this code to work batch-wise and avoid the
> problem.
>
>
> Regards,
> Akash R