
You initially developed a database system to manage patient records for a pilot project encompassing a few hundred patients across three clinics. The design utilized a single table to store all patient details and their respective visits, using self-joins for reporting purposes. At that time, server resource utilization was at 50%. However, the project's scope has since expanded significantly, requiring the database to accommodate a hundredfold increase in patient records. Currently, the reporting functions are failing due to excessive runtime or insufficient computational resources. What modifications should you make to the database design to efficiently handle the expanded dataset?
A
Add capacity (memory and disk space) to the database server by the order of 200.
B
Shard the tables into smaller ones based on date ranges, and only generate reports with prespecified date ranges.
C
Normalize the master patient-record table into the patient table and the visits table, and create other necessary tables to avoid self-join.
D
Partition the table into smaller tables, with one for each clinic. Run queries against the smaller table pairs, and use unions for consolidated reports.
Explanation:
The correct answer is C. Normalizing the database design by breaking the master patient-record table into separate tables (patient and visits) and eliminating self-joins will improve performance and make the database more scalable. This approach reduces data redundancy and improves data integrity, which are critical as the dataset grows significantly. Other options either do not address the core issue of performance due to self-joins or introduce new limitations and complexities.
Ultimate access to all questions.