Software Architecture Books Summary And Highlights -- Part 9 Data And Database

- January 15, 2023

Data And Database

Highlights

When to break a database into multiple databases?

Change control
- How many services are impacted by a change to a database table? If too many services depend on one table, that table is hard to decompose.
- It is better to refactor the database schema into bounded contexts first, so that each table is directly accessed by only one service.
Connection management
- Breaking a system into multiple services may increase the number of connections to the database.
- One solution is to assign a connection quota to each service, either split evenly or weighted by each service's needs.
Scalability and fault tolerance
- Breaking the data into multiple databases can help increase scalability and fault tolerance.
Database type optimization
- Some kinds of data can be moved to a more suitable type of database.
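
The connection quota idea under "Connection management" can be sketched as follows. This is only an illustration, not something taken from the book: the function name `allocate_connection_quota`, the service names, and the use of integer weights are all assumptions. Equal weights give an even split; unequal weights give a split based on each service's needs.

```python
def allocate_connection_quota(max_connections, weights):
    """Split a database's connection limit among services.

    weights maps a (hypothetical) service name to a positive integer weight.
    The returned quotas always add up to max_connections.
    """
    if not weights or any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be a non-empty map of positive integers")

    total_weight = sum(weights.values())
    # Start with the rounded-down proportional share for each service.
    quota = {name: max_connections * w // total_weight
             for name, w in weights.items()}

    # Hand the connections lost to rounding to the services with the
    # largest fractional remainder, one connection each.
    leftover = max_connections - sum(quota.values())
    by_remainder = sorted(
        weights,
        key=lambda name: (max_connections * weights[name]) % total_weight,
        reverse=True,
    )
    for name in by_remainder[:leftover]:
        quota[name] += 1
    return quota


even = allocate_connection_quota(100, {"orders": 1, "billing": 1, "catalog": 1})
weighted = allocate_connection_quota(100, {"orders": 3, "billing": 1})
print(even)
print(weighted)
```

Each service would then cap its own connection pool at its quota, so that the sum across services never exceeds what the database allows.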

SAH Ch 6 Pulling Apart Operational Data

How to split table ownership among multiple services

Several scenarios:

Single ownership: one table is written by only one service; straightforward…
Common ownership: one table is written by all services.
1. Create a single wrapper service above that table
Joint ownership: one table is written by some but not all services; 4 solutions
1. Split the database for each service
2. Allow multiple services to access the same database; define a domain database and let those services own it jointly
3. Delegate all writes and reads to one service. That service can be picked by either
  1. which service has the closer relationship to the data
  2. which service has higher performance requirements
4. Combine the services into one; this may affect scalability
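
As a rough sketch of the three scenarios above, a table can be classified by looking at which services write to it. The function name, the table names, and the service names are hypothetical and only serve the example.

```python
def classify_ownership(writers, all_services):
    """Return 'single', 'common', or 'joint' for one table.

    writers: set of services that write to the table.
    all_services: set of every service in the system.
    """
    if not writers:
        raise ValueError("a table with no writers has no owner to assign")
    if not writers <= all_services:
        raise ValueError("unknown service in writers")
    if len(writers) == 1:
        return "single"   # the only writer owns the table
    if writers == all_services:
        return "common"   # candidate for a wrapper service
    return "joint"        # pick one of the four joint-ownership solutions


services = {"orders", "billing", "catalog"}
table_writers = {
    "invoice": {"billing"},
    "audit_log": {"orders", "billing", "catalog"},
    "product": {"orders", "catalog"},
}
for table, writers in table_writers.items():
    print(table, classify_ownership(writers, services))
```

Running this kind of check over a real write-access inventory is one way to find the joint-ownership tables, which are the ones that need a design decision.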

SAH Ch 9 Data Ownership and Distributed Transactions

When to merge multiple databases into a single one?

Data relationships: Are there foreign keys, triggers, or views that form close relationships between the tables? These are important for data consistency.
Database transactions: Is a single transactional unit of work necessary to ensure data integrity and consistency?

SAH Ch 6 Pulling Apart Operational Data

Distributed Data Access Pattern

Access data through interservice communication
- Pro: simple
- Cons: low scalability, throughput, availability
Replicate the needed column data into each service's own database
- Pros:
  - good data access performance and scalability, fault tolerance;
  - no service dependency
- Cons: data consistency, ownership and synchronization challenges
Access data through cache
- Pros:
  - good performance and consistency;
  - Ownership is preserved
- Cons:
  - hard to configure
  - Not scalable for high data volumes or high update rates
Multiple services share same database
- Pros:
  - good performance and consistency;
  - no service dependency
- Cons: a broader bounded context, plus challenges around data ownership and data access security
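
To make the "access data through cache" option more concrete, here is a minimal in-process sketch of why ownership is preserved: only the owning service can write, and other services hold read-only replicas that receive each update. The class names `OwnedCache` and `ReadOnlyReplica` are assumptions made for this example; a real system would use a distributed caching product with replication over the network rather than direct method calls.

```python
class ReadOnlyReplica:
    """Copy of the cache held by a consuming service; exposes reads only."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def _apply(self, key, value):
        # Called by the owner's replication step, not by the consuming service.
        self._data[key] = value


class OwnedCache:
    """Cache held by the owning service; the only place writes happen."""

    def __init__(self):
        self._data = {}
        self._replicas = []

    def attach(self, replica):
        # Seed a new replica with the current contents, then keep it updated.
        for key, value in self._data.items():
            replica._apply(key, value)
        self._replicas.append(replica)

    def put(self, key, value):
        self._data[key] = value
        for replica in self._replicas:
            replica._apply(key, value)


owner = OwnedCache()            # e.g. lives in the service that owns the data
replica = ReadOnlyReplica()     # e.g. lives in a service that only reads it
owner.put("customer-1", "Alice")
owner.attach(replica)
owner.put("customer-2", "Bob")
print(replica.get("customer-1"), replica.get("customer-2"))
```

The sketch also hints at the listed cons: every update is pushed to every replica, which becomes costly with high data volumes or high update rates.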

SAH Ch 10 Distributed Data Access

Managing Analytical Data

Data warehouse

Cons:

Integration brittleness
- Changing the production database schema also entails changes to the transformation and import logic.
Extreme partitioning of domain knowledge
- All domains are coupled together: architects, developers, DBAs, and data scientists must all coordinate on data changes and evolution, forcing tight coupling between vastly different parts of the ecosystem.
Complexity
Synchronization creates bottlenecks
Limited functionality for intended purpose
- Most data warehouses failed because they didn't deliver business value commensurate with the effort required to create and maintain the warehouse.

Data Lake

Performs no transformations on load, giving business users access to analytical data in its natural format, which typically requires transformation and massaging for their purpose.

Load and transform instead of transform and load
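
A toy sketch of "load and transform" is shown below: records are stored exactly as they arrive, and each consumer applies its own transformation when it reads. The record fields (`created`, `amount_cents`), the source names, and the function names are made up for the example.

```python
def load(lake, source, payload):
    """Store the record untouched, tagged only with where it came from."""
    lake.append({"source": source, "payload": payload})


def monthly_revenue(lake):
    """One consumer's transformation, applied at read time."""
    totals = {}
    for record in lake:
        if record["source"] != "orders":
            continue
        payload = record["payload"]
        month = payload["created"][:7]          # "2023-01-15" -> "2023-01"
        totals[month] = totals.get(month, 0) + payload["amount_cents"]
    return totals


lake = []  # stands in for the data lake's raw storage
load(lake, "orders", {"created": "2023-01-15", "amount_cents": 1200})
load(lake, "orders", {"created": "2023-01-20", "amount_cents": 800})
load(lake, "clicks", {"page": "/home"})
print(monthly_revenue(lake))
```

In a warehouse-style "transform and load" flow, the month bucketing would instead happen before storage, so a change to the source fields would break the import step rather than one consumer's query.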

Cons

Difficulty in discovering the proper assets
Still partitioned technically instead of by domain

Data Mesh

Principles:

Domain ownership of data
1. Data is owned and shared by the domains that are most intimately familiar with the data.
Data as a product
1. Puts in place the organizational roles and success metrics necessary to ensure that domains provide their data in a way that delights data consumers across the organization.
Self-serve data platform
1. Makes developers' lives easier
Computational federated governance
1. Organization-wide governance requirements, such as compliance, security, privacy, and quality of data, as well as interoperability of data products, are met consistently across all domains.

SAH Ch 14 Managing Analytical Data

Reporting

Several ways:

Export data for reporting through an API, e.g. a batch API, or let the API write to a file
Keep the data sync code separate but owned by the domain team; good decoupling
Send data change events to a reporting service; hard to scale
Build reporting data from backup data

MSV Ch 5 Splitting the monolith

Related Chapters

SAH Ch 6 Pulling Apart Operational Data

SAH Ch 9 Data Ownership and Distributed Transactions

SAH Ch 10 Distributed Data Access

SAH Ch 14 Managing Analytical Data

MSV Ch 5 Splitting the monolith
