[GitHub] [carbondata] nihal0107 opened a new pull request #4116: [CARBONDATA-4162] Leverage Secondary Index till segment level with Spark plan rewrite


GitBox

nihal0107 opened a new pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116


    ### Why is this PR needed?
    In the existing architecture, if the parent (main) table and the SI table do not have the same valid segments, the SI table is disabled. From the next query onwards, only the parent table is scanned and pruned until the next load or REINDEX command is triggered (these commands bring the parent and SI table segments back in sync). As a result, queries take longer to return results while the SI is disabled.
   
    ### What changes were proposed in this PR?
   1. Instead of disabling the SI table when the parent and child table segments are not in sync, pruning is done on the SI table for all of its valid segments (segments with status success, marked for update, or load partial success), and the remaining segments are pruned by the parent table.
   2. Different SI tables may now contain different numbers of segments. In that case, the best-fit SI table is identified based on segment count. If more than one SI table contains the same segment count, the best-fit SI table is identified based on the current design.
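
The two changes above can be sketched roughly as follows. This is a minimal Python illustration, not the code from this PR: `SITable`, `split_segments`, `pick_best_fit`, the `tie_break` callback, and the status strings are all hypothetical names chosen for the example. It also assumes that "best fit based on segment count" means the SI table with the most valid segments, which the description does not state explicitly.

```python
from dataclasses import dataclass

# Statuses the PR description treats as valid for SI pruning.
# The string values are illustrative, not CarbonData's actual enum names.
VALID_STATUSES = {"SUCCESS", "MARKED_FOR_UPDATE", "LOAD_PARTIAL_SUCCESS"}


@dataclass
class SITable:
    """Hypothetical stand-in for a secondary index table."""
    name: str
    segment_status: dict  # segment id -> load status


def valid_si_segments(si, parent_segments):
    """Parent segments that the SI table also holds in a valid state."""
    return {seg for seg in parent_segments
            if si.segment_status.get(seg) in VALID_STATUSES}


def split_segments(si, parent_segments):
    """Return (segments pruned via the SI, segments pruned via the parent)."""
    via_si = valid_si_segments(si, parent_segments)
    return via_si, set(parent_segments) - via_si


def pick_best_fit(si_tables, parent_segments, tie_break):
    """Pick the SI table with the most valid segments (assumed criterion).

    Ties are delegated to `tie_break`, a placeholder for the existing
    best-fit selection logic, which is not described in this thread.
    """
    if not si_tables:
        return None
    counts = {si.name: len(valid_si_segments(si, parent_segments))
              for si in si_tables}
    best = max(counts.values())
    candidates = [si for si in si_tables if counts[si.name] == best]
    return candidates[0] if len(candidates) == 1 else tie_break(candidates)


# Example: si_b is valid for three parent segments, si_a for only two.
parent = {"0", "1", "2", "3"}
si_a = SITable("si_a", {"0": "SUCCESS", "1": "SUCCESS", "2": "LOAD_FAILURE"})
si_b = SITable("si_b", {"0": "SUCCESS", "1": "MARKED_FOR_UPDATE", "2": "SUCCESS"})
best = pick_best_fit([si_a, si_b], parent, tie_break=lambda c: c[0])
via_si, via_parent = split_segments(best, parent)
```

In this sketch, segment `3` exists only in the parent table, so it falls back to parent-table pruning while the other segments are pruned through the chosen SI table.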
       
    ### Does this PR introduce any user interface change?
    - No
   
    ### Is any new testcase added?
    - Yes
   
       
   


--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4116: [CARBONDATA-4162] Leverage Secondary Index till segment level with Spark plan rewrite

GitBox

CarbonDataQA2 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-813379819


   Build Failed with Spark 2.4.5. Please check CI: http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3374/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4116: [CARBONDATA-4162] Leverage Secondary Index till segment level with Spark plan rewrite

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-813382532


   Build Failed with Spark 2.3.4. Please check CI: http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5125/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4116: [WIP][CARBONDATA-4162] Leverage Secondary Index till segment level with Spark plan rewrite

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-813850926


   Build Failed with Spark 2.4.5. Please check CI: http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3378/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4116: [WIP][CARBONDATA-4162] Leverage Secondary Index till segment level with Spark plan rewrite

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-813851440


   Build Failed with Spark 2.3.4. Please check CI: http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5129/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4116: [WIP][CARBONDATA-4162] Leverage Secondary Index till segment level with Spark plan rewrite

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-813910798


   Build Failed with Spark 2.4.5. Please check CI: http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3382/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4116: [WIP][CARBONDATA-4162] Leverage Secondary Index till segment level with Spark plan rewrite

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-813911488


   Build Failed with Spark 2.3.4. Please check CI: http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5133/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4116: [WIP][CARBONDATA-4162] Leverage Secondary Index till segment level with Spark plan rewrite

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-814159308


   Build Failed with Spark 2.4.5. Please check CI: http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3384/
   



[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4116: [WIP][CARBONDATA-4162] Leverage Secondary Index till segment level with Spark plan rewrite

GitBox
In reply to this post by GitBox

CarbonDataQA2 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-814164174


   Build Success with Spark 2.3.4. Please check CI: http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5135/
   

