TcpServer Refactoring: Architecture Decomposition Guide

Alex Johnson

In this article, we'll dive deep into the refactoring of a TcpServer architecture, focusing on decomposition strategies and improvements. This is based on a detailed design document aimed at transforming a monolithic TcpServer into a set of focused, testable components. We'll explore the existing architecture, identify its problems, propose a new architecture, and discuss the migration strategy. Let's get started!

1. Introduction: Why Refactor TcpServer?

In today's software development landscape, maintainability, testability, and scalability are paramount. When it comes to server architectures, these qualities become even more critical. Our journey begins with understanding the need for refactoring. The existing TcpServer implementation has grown significantly, exceeding 600 lines of code. This monolithic structure violates the Single Responsibility Principle, a core tenet of good software design. The current TcpServer handles a multitude of tasks, including network I/O, request dispatch, command handling, auto-dump scheduling, and dump file lifecycle management. This complexity makes the code difficult to understand, test, and maintain.

1.1 The Goals of Refactoring

The primary goals of this refactoring effort are to:

  • Achieve Separation of Concerns: Each component should have a single, well-defined responsibility. This makes the code easier to understand and modify.
  • Enhance Testability: Components should be testable independently without the need for a live network infrastructure. This allows for faster and more reliable testing.
  • Improve Maintainability: Classes should ideally be less than 200 lines of code, making them easier to comprehend and manage. Smaller classes are often more focused and less prone to errors.
  • Eliminate Side Effects: The presentation layer should not modify the server's state. This ensures that formatting and displaying data don't inadvertently change the underlying data.

1.2 Non-Goals: What We're Not Trying to Do

It’s equally important to define what we are not trying to achieve during this refactoring process. This helps to maintain focus and prevent scope creep:

  • Avoid Changing External API or Protocol: The refactoring should not introduce breaking changes to the external interfaces or protocols used by the TcpServer. This ensures that existing clients can continue to interact with the server without modification.
  • Maintain Current Performance: The refactoring should not aim for performance optimizations at this stage. The goal is to maintain the current performance levels, with optimization efforts potentially addressed in future iterations.
  • Avoid Adding New Features: The focus is purely on refactoring the existing code. New features should not be introduced during this process to avoid unnecessary complexity and risk.

2. Current Architecture Analysis: Understanding the Existing Structure

Before diving into the refactoring, it's crucial to thoroughly analyze the current architecture. This involves understanding the responsibilities of the existing components and identifying any potential issues. The current TcpServer class handles seven distinct responsibilities, making it a prime candidate for decomposition.

2.1 TcpServer Responsibility Breakdown

The existing TcpServer class handles a wide range of responsibilities, including:

  1. Network Socket Management (150 lines): Managing the server's socket, accepting client connections, and handling client disconnections. This includes setting socket options and managing the low-level network communication.
  2. Request Processing & Dispatch (100 lines): Receiving client requests, parsing them, and dispatching them to the appropriate handlers. This involves interpreting the client's commands and directing them to the correct processing logic.
  3. Command Handler Lifecycle (50 lines): Managing the lifecycle of command handlers, such as search, document, dump, admin, replication, and debug handlers. This includes creating, initializing, and destroying these handlers.
  4. Auto-Dump Scheduling (150 lines): Scheduling automatic dumps of the server's data, which includes starting, stopping, and managing the auto-save thread, as well as cleaning up old dumps.
  5. Connection Pool Management (80 lines): Managing a pool of active client connections, including tracking and managing connection contexts.
  6. Thread Pool Management: Utilizing a thread pool to handle client requests concurrently, improving the server's performance and responsiveness.
  7. Server State Management: Managing the server's state, such as whether it's running, in read-only mode, or loading data. This also includes managing server statistics.

class TcpServer {
  // 1. Network Socket Management (150 lines)
  int server_fd_;
  void AcceptThreadFunc();
  void HandleClient(int fd);
  bool SetSocketOptions(int fd);

  // 2. Request Processing & Dispatch (100 lines)
  std::string ProcessRequest(...);
  query::QueryParser query_parser_;

  // 3. Command Handler Lifecycle (50 lines)
  std::unique_ptr<CommandHandler> search_handler_;
  std::unique_ptr<CommandHandler> document_handler_;
  std::unique_ptr<CommandHandler> dump_handler_;
  std::unique_ptr<CommandHandler> admin_handler_;
  std::unique_ptr<CommandHandler> replication_handler_;
  std::unique_ptr<CommandHandler> debug_handler_;

  // 4. Auto-Dump Scheduling (150 lines)
  void StartAutoSave();
  void StopAutoSave();
  void AutoSaveThread();
  void CleanupOldDumps();

  // 5. Connection Pool Management (80 lines)
  std::set<int> connection_fds_;
  std::unordered_map<int, ConnectionContext> connection_contexts_;

  // 6. Thread Pool Management
  std::unique_ptr<ThreadPool> thread_pool_;

  // 7. Server State Management
  ServerStats stats_;
  std::atomic<bool> running_;
  std::atomic<bool> read_only_;
  std::atomic<bool> loading_;
};

2.2 Current Metrics: Assessing the Complexity

To quantify the complexity of the current implementation, we can look at several metrics:

  • Total Lines of Code: The tcp_server.cpp file contains 625 lines, and the tcp_server.h file adds another 238 lines, for a total of 863 lines of code. This substantial size makes the class difficult to navigate and understand.
  • Cyclomatic Complexity: The cyclomatic complexity is estimated to be around 45, indicating a high level of decision points and branching within the code. This makes the code harder to test and maintain.
  • Dependencies: The class has more than 20 includes, indicating a large number of external dependencies. This can lead to increased coupling and reduced modularity.
  • Private Methods: The class has 14 private methods, suggesting a complex internal structure and a large number of responsibilities.

2.3 ResponseFormatter Layer Violation: Identifying Architectural Issues

One significant architectural issue is the violation of layer separation in the ResponseFormatter::FormatInfoResponse() method. This method performs three distinct tasks:

  1. Aggregates state from multiple tables (domain logic).
  2. Updates server statistics (state mutation).
  3. Formats response strings (presentation logic).

This mixing of responsibilities violates the principle of separation of concerns and makes the code harder to test and maintain.

2.3.1 Problematic Code

The following code snippet illustrates the issue:

// response_formatter.cpp:121-317
std::string ResponseFormatter::FormatInfoResponse(
    const std::unordered_map<std::string, TableContext*>& table_contexts,
    ServerStats& stats,  // Non-const reference = side effects
    mysql::BinlogReader* binlog_reader
) {
  // Lines 180-202: Domain logic - state aggregation
  size_t total_index_memory = 0;
  size_t total_doc_memory = 0;
  for (const auto& [table_name, ctx] : table_contexts) {
    total_index_memory += ctx->index->MemoryUsage();
    total_doc_memory += ctx->doc_store->MemoryUsage();
    // ... more aggregation
  }
  size_t total_memory = total_index_memory + total_doc_memory;

  // Line 210: Side effect - state mutation in formatter!
  stats.UpdateMemoryUsage(total_memory);

  // Lines 212-317: Presentation logic
  std::ostringstream oss;
  oss << "used_memory_bytes: " << total_memory << "\r\n";
  // ...
  return oss.str();
}

The ResponseFormatter should ideally focus solely on formatting the response string. However, it also aggregates metrics and updates server statistics, leading to side effects. The non-const reference to ServerStats indicates a state mutation within the formatter, which is a clear violation of layer separation.
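To make the intended separation concrete, here is a minimal sketch of how the two layers could be split: aggregation and the stats mutation move into a domain-level StatisticsService, while the formatter becomes a pure function of its inputs. The `TableMetrics` struct, `StatisticsService` class, and `AggregateMemory` method names are hypothetical illustrations, not the actual refactored API; the real code would aggregate from `Index` and `DocumentStore` objects rather than plain numbers.

```cpp
#include <sstream>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for per-table memory figures; the real code would
// query index->MemoryUsage() and doc_store->MemoryUsage().
struct TableMetrics {
  size_t index_memory = 0;
  size_t doc_memory = 0;
};

// Domain layer: aggregation and state mutation belong here, not in the formatter.
class StatisticsService {
 public:
  size_t AggregateMemory(
      const std::unordered_map<std::string, TableMetrics>& tables) {
    size_t total = 0;
    for (const auto& [name, m] : tables) {
      total += m.index_memory + m.doc_memory;
    }
    used_memory_bytes_ = total;  // the mutation now lives in the domain layer
    return total;
  }
  size_t used_memory_bytes() const { return used_memory_bytes_; }

 private:
  size_t used_memory_bytes_ = 0;
};

// Presentation layer: a pure function of its inputs, no side effects.
std::string FormatInfoResponse(size_t used_memory_bytes) {
  std::ostringstream oss;
  oss << "used_memory_bytes: " << used_memory_bytes << "\r\n";
  return oss.str();
}
```

With this split, the formatter can be unit-tested with plain values, and the stats mutation is observable exactly where a reader would expect it: in the service.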

2.3.2 Layer Violation Diagram

The following diagram illustrates the layer violation:

graph TB
    subgraph presentation["Presentation Layer (ResponseFormatter)"]
        format["Format strings<br/>Layout data"]
        violation["❌ stats.UpdateMemoryUsage() ← VIOLATION<br/>Should NOT be here!"]
    end

    subgraph domain["Domain/Service Layer (StatisticsService)"]
        aggregate["Aggregate metrics<br/>Update statistics ← Should be here<br/>Business logic"]
    end

    violation -.->|"Side effect<br/>crosses layer"| aggregate

    style violation fill:#ffcccc,stroke:#ff0000,stroke-width:2px
    style aggregate fill:#ccffcc,stroke:#00ff00,stroke-width:2px

2.4 Table Resource Management Duplication

Another issue is the duplication of table context conversion logic in multiple locations. The same pattern of converting table contexts appears in dump_handler.cpp and tcp_server.cpp.

2.4.1 Duplicated Patterns

The following code snippets illustrate the duplication:

Pattern 1: dump_handler.cpp:107-111

std::unordered_map<std::string, std::pair<index::Index*, storage::DocumentStore*>> converted_contexts;
for (const auto& [table_name, table_ctx] : ctx_.table_contexts) {
  converted_contexts[table_name] = {table_ctx->index.get(), table_ctx->doc_store.get()};
}
bool success = storage::dump_v1::WriteDumpV1(filepath, gtid, *ctx_.full_config, converted_contexts);

Pattern 2: tcp_server.cpp:553-556 (AutoSaveThread)

std::unordered_map<std::string, std::pair<index::Index*, storage::DocumentStore*>> converted_contexts;
for (const auto& [table_name, table_ctx] : table_contexts_) {
  converted_contexts[table_name] = {table_ctx->index.get(), table_ctx->doc_store.get()};
}
bool success = storage::dump_v1::WriteDumpV1(dump_path.string(), gtid, *full_config_, converted_contexts);

Pattern 3: dump_handler.cpp:143-146

std::unordered_map<std::string, std::pair<index::Index*, storage::DocumentStore*>> converted_contexts;
for (const auto& [table_name, table_ctx] : ctx_.table_contexts) {
  converted_contexts[table_name] = {table_ctx->index.get(), table_ctx->doc_store.get()};
}
bool success = storage::dump_v1::ReadDumpV1(filepath, gtid, loaded_config, converted_contexts, ...);

2.4.2 Problem: Lack of Encapsulation

The problem here is the lack of an encapsulation boundary around table resource management. The raw unordered_map<string, TableContext*> is directly exposed in the HandlerContext, leading to duplication and potential inconsistencies.
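One way to introduce that boundary is a small owner class that holds the table contexts and exposes the raw-pointer view that WriteDumpV1/ReadDumpV1 need, so the conversion loop exists in exactly one place. This is a sketch under assumptions: the `TableManager` name, its `AddTable`/`DumpContexts` methods, and the local `Index`/`DocumentStore` stand-ins are illustrative, not part of the actual codebase.

```cpp
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-ins for index::Index and storage::DocumentStore.
struct Index {};
struct DocumentStore {};

struct TableContext {
  std::unique_ptr<Index> index;
  std::unique_ptr<DocumentStore> doc_store;
};

// Single owner for table resources; the conversion loop lives here once.
class TableManager {
 public:
  void AddTable(const std::string& name, std::unique_ptr<TableContext> ctx) {
    tables_[name] = std::move(ctx);
  }

  // Replaces the three copy-pasted loops: callers ask the manager for the
  // raw-pointer view that the dump read/write functions require.
  std::unordered_map<std::string, std::pair<Index*, DocumentStore*>>
  DumpContexts() const {
    std::unordered_map<std::string, std::pair<Index*, DocumentStore*>> out;
    for (const auto& [name, ctx] : tables_) {
      out[name] = {ctx->index.get(), ctx->doc_store.get()};
    }
    return out;
  }

 private:
  std::unordered_map<std::string, std::unique_ptr<TableContext>> tables_;
};
```

Handlers and the auto-save path would then call `DumpContexts()` instead of iterating over a raw map, so a future change to the conversion (or to TableContext's layout) touches one method rather than three call sites.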

3. Problem Identification: Key Issues in the Current Design

Having analyzed the current architecture, we can identify three major problems:

3.1 Problem 1: TcpServer
