Item request has been placed! ×
Item request cannot be made. ×
loading  Processing Request

Method and apparatus for comparing rows of subsets of two database tables without regard to an original order of rows in either database table

Item request has been placed! ×
Item request cannot be made. ×
loading   Processing Request
  • Publication Date:
    April 08, 2025
  • Additional Information
    • Patent Number:
      12271,367
    • Appl. No:
      18/747171
    • Application Filed:
      June 18, 2024
    • Abstract:
      A method and system are provided for comparing a first database table and a second database table with each other to produce a table of differences therebetween. The first database table and the second database table include one or more subsets of data that are intended to match each other, and the first database table and the second database table each include multiple rows. Indicia is generated for each of the multiple rows of a subset of the first database table, and each of the multiple rows of a subset of the second database table. The generated indicia for each of the multiple rows of the subset of the first database table are compared with the generated indicia for the subset of each of the multiple rows of the subset of the second database table, on a row-by-row basis, wherein the comparing occurs without regard to an original order of the rows in either the first database table or the second database table. Any generated indicia for each of the multiple rows of the subset of the first database table that are not present in the generated indicia for each of the multiple rows of the subset of the second database table are identified as a first set of mismatched indicia. Any generated indicia for each of the multiple rows of the subset of the second database table that are not present in the generated indicia for each of the multiple rows of subset of the first database table are identified as a second set of mismatched indicia. The table of differences is populated with the generated indicia. The second database table may be repaired by using at least one of the first set of mismatched indicia and the second set of mismatched indicia.
    • Inventors:
      Gravic, Inc. (Malvern, PA, US)
    • Assignees:
      GRAVIC, INC (Malvern, PA, US)
    • Claim:
      1. An automated method for comparing a first database table and a second database table with each other to produce a table of differences therebetween, wherein the first database table and the second database table include one or more subsets of data that are intended to match each other, the first database table and the second database table each including multiple rows, the method comprising: (a) generating, by a processor, indicia for: (i) each of the multiple rows of a subset of the first database table, wherein the subset of the first database table includes less than all of the rows of first database table, and (ii) each of the multiple rows of a subset of the second database table; (b) comparing, in a comparison engine, the generated indicia for each of the multiple rows of the subset of the first database table with the generated indicia for each of the multiple rows of the subset of the second database table on a row-by-row basis, wherein the comparing occurs without regard to an original order of the rows in either the first database table or the second database table; (c) identifying, by the comparison engine, any generated indicia for each of the multiple rows of the subset of the first database table that are not present in the generated indicia for each of the multiple rows of the subset of the second database table, thereby identifying a first set of mismatched indicia; (d) identifying, by the comparison engine, any generated indicia for each of the multiple rows of the subset of the second database table that are not present in the generated indicia for each of the multiple rows of the subset of the first database table, thereby identifying a second set of mismatched indicia; and (e) populating the table of differences with the generated indicia identified in steps (c) and (d).
    • Claim:
      2. The method of claim 1 further comprising: (f) repairing the second database table by using at least one of the first set of mismatched indicia and the second set of mismatched indicia.
    • Claim:
      3. The method of claim 2 wherein the repairing occurs by using the first set of mismatched indicia to insert or update corresponding rows from the first database table into the second database table.
    • Claim:
      4. The method of claim 2 wherein the repairing occurs by using the second set of mismatched indicia to delete corresponding rows from the second database table.
    • Claim:
      5. The method of claim 1 wherein the generated indicia is a hash.
    • Claim:
      6. The method of claim 5 wherein the hash is one of CRC, MD5, SHA-1, and SHA-2.
    • Claim:
      7. The method of claim 1 wherein each of the multiple rows to be compared includes related data for the rows to be compared.
    • Claim:
      8. The method of claim 7 wherein the related data includes child rows for each parent row being compared between two parent tables.
    • Claim:
      9. The method of claim 1 wherein the comparing includes sorting the generated indicia prior to the comparing.
    • Claim:
      10. The method of claim 1 wherein the table of differences is in memory.
    • Claim:
      11. The method of claim 1 wherein the comparing is followed by a byte-by-byte comparison of the data used to generate the indicia.
    • Claim:
      12. The method of claim 1 further comprising, prior to step (e): (f) comparing additional subsets of data that are intended to match each other to create additional records in the table of differences; and (g) matching records in the table of differences across subsets to identify and eliminate records in the table of differences which match records in the table of differences from other subsets.
    • Claim:
      13. The method of claim 1 wherein each row has multiple columns, and wherein the indicia generated for each of the multiple rows is generated using a subset of the columns of the rows.
    • Claim:
      14. The method of claim 1 wherein the subset of the first database table is one or more partitions of the first database table, and the subset of the second database table is one or more partitions of the second database table.
    • Claim:
      15. An automated method for comparing a first database table and a second database table with each other in order to repair the second database table, wherein the first database table and the second database table include one or more subsets of data that are intended to match each other, the first database table and the second database table each including multiple rows, the method comprising: (a) generating, by a processor, indicia for: (i) each of the multiple rows of a subset of the first database table, wherein the subset of the first database table includes less than all of the rows of first database table, and (ii) each of the multiple rows of a subset of the second database table; (b) comparing, in a comparison engine, the generated indicia for each of the multiple rows of the subset of the first database table with the generated indicia for each of the multiple rows of the subset of the second database table on a row-by-row basis, wherein the comparing occurs without regard to an original order of the rows in either the first database table or the second database table; (c) identifying, by the comparison engine, any generated indicia for each of the multiple rows of the subset of the first database table that are not present in the generated indicia for each of the multiple rows of the subset of the second database table, thereby identifying a first set of mismatched indicia; (d) identifying, by the comparison engine, any generated indicia for each of the multiple rows of the subset of the second database table that are not present in the generated indicia for each of the multiple rows of the subset of the first database table, thereby identifying a second set of mismatched indicia; and (e) repairing the second database table by using at least one of the first set of mismatched indicia and the second set of mismatched indicia.
    • Claim:
      16. The method of claim 15 wherein the repairing occurs immediately upon identifying mismatched indicia in either the first or second set of mismatched indicia.
    • Claim:
      17. The method of claim 15 wherein the repairing occurs by using the first set of mismatched indicia to insert or update corresponding rows from the first database table into the second database table.
    • Claim:
      18. The method of claim 15 wherein the repairing occurs by using the second set of mismatched indicia to delete corresponding rows from the second database table.
    • Claim:
      19. The method of claim 15 wherein each row has multiple columns, and wherein the indicia generated for each of the multiple rows is generated using a subset of the columns of the rows.
    • Claim:
      20. The method of claim 15 wherein the subset of the first database table is one or more partitions of the first database table, and the subset of the second database table is one or more partitions of the second database table.
    • Claim:
      21. An apparatus comprising: (a) first memory configured to store a first database table and a second database table, wherein the first database table and the second database table include one or more subsets of data that are intended to match each other, the first database table and the second database table each including multiple rows; (b) second memory configured to store a table of differences that stores differences between the first database table and the second database table; (c) a first processor configured to generate indicia for: (i) each of the multiple rows of a subset of the first database table, wherein the subset of the first database table includes less than all of the rows of first database table, and (ii) each of the multiple rows of a subset of the second database table; and (d) a comparison engine including a second processor configured to: (i) compare the generated indicia for each of the multiple rows of the subset of the first database table with the generated indicia for each of the multiple rows of the subset of the second database table on a row-by-row basis, wherein the comparing occurs without regard to an original order of the rows in either the first database table or the second database table, (ii) identify any generated indicia for each of the multiple rows of the subset of the first database table that are not present in the generated indicia for each of the multiple rows of the subset of the second database table, thereby identifying a first set of mismatched indicia, and (iii) identify any generated indicia for each of the multiple rows of the subset of the second database table that are not present in the generated indicia for each of the multiple rows of the subset of the first database table, thereby identifying a second set of mismatched indicia, wherein the table of differences is populated with the generated indicia.
    • Claim:
      22. The apparatus of claim 21 wherein the generated indicia is a hash.
    • Claim:
      23. The apparatus of claim 21 wherein each of the multiple rows to be compared includes related data for the rows to be compared.
    • Claim:
      24. The apparatus of claim 21 wherein the comparing includes sorting the generated indicia prior to the comparing.
    • Claim:
      25. The apparatus of claim 21 wherein the comparison engine including the second processor is further configured to: (iv) compare additional subsets of data that are intended to match each other to create additional records in the table of differences, and (v) match records in the table of differences across subsets to identify and eliminate records in the table of differences which match records in the table of differences from other subsets.
    • Claim:
      26. The apparatus of claim 21 wherein each row has multiple columns, and wherein the indicia generated for each of the multiple rows is generated using a subset of the columns of the rows.
    • Claim:
      27. The apparatus of claim 21 wherein the subset of the first database table is one or more partitions of the first database table, and the subset of the second database table is one or more partitions of the second database table.
    • Claim:
      28. An apparatus comprising: (a) first memory configured to store a first database table and a second database table, wherein the first database table and the second database table include one or more subsets of data that are intended to match each other, the first database table and the second database table each including multiple rows; (b) a first processor configured to generate indicia for: (i) each of the multiple rows of a subset of the first database table, wherein the subset of the first database table includes less than all of the rows of first database table, and (ii) each of the multiple rows of a subset of the second database table; (c) a comparison engine including a second processor configured to: (i) compare the generated indicia for each of the multiple rows of the subset of the first database table with the generated indicia for each of the multiple rows of the subset of the second database table on a row-by-row basis, wherein the comparing occurs without regard to an original order of the rows in either the first database table or the second database table, (ii) identify any generated indicia for each of the multiple rows of the subset of the first database table that are not present in the generated indicia for each of the multiple rows of the subset of the second database table, thereby identifying a first set of mismatched indicia, and (iii) identify any generated indicia for each of the multiple rows of the subset of the second database table that are not present in the generated indicia for each of the multiple rows of the subset of the first database table, thereby identifying a second set of mismatched indicia; and (d) a repair engine configured to repair the second database table by using at least one of the first set of mismatched indicia and the second set of mismatched indicia.
    • Claim:
      29. The apparatus of claim 28 further comprising: (e) second memory configured to store a table of differences that stores differences between the first database table and the second database table, wherein the table of differences is populated with the generated indicia, and wherein the repair engine is configured to repair the second database table by further using the generated indicia stored in the table of differences.
    • Claim:
      30. The apparatus of claim 28 wherein each row has multiple columns, and wherein the indicia generated for each of the multiple rows is generated using a subset of the columns of the rows.
    • Claim:
      31. The apparatus of claim 28 wherein the subset of the first database table is one or more partitions of the first database table, and the subset of the second database table is one or more partitions of the second database table.
    • Patent References Cited:
      6317766 November 2001 Grover
      8903779 December 2014 Holenstein et al.
      9020987 April 2015 Nanda
      9804935 October 2017 Holenstein et al.
      9830223 November 2017 Holenstein et al.
      10152506 December 2018 Hoffmann et al.
      11120047 September 2021 Hoffmann et al.
      12019611 June 2024 Jarema
      2004/0138921 July 2004 Broussard
      2009/0024656 January 2009 Wellman
      2009/0037488 February 2009 Abrams
      2015/0087392 March 2015 Guerrero
      2016/0275150 September 2016 Bournonnais
      2016/0378752 December 2016 Anderson
      2018/0276216 September 2018 Anderson
    • Other References:
      Wikipedia entry for “Rsync.” Downloaded from: Last edited date: May 13, 2022, original posting date: unknown, 9 pages. cited by applicant
    • Primary Examiner:
      Mamillapalli, Pavan
    • Attorney, Agent or Firm:
      Jablon, Clark A.
    • Accession Number:
      edspgr.12271367