EDI System Guide: 204 Transactions & Implementation

by Luna Greco

Hey guys! Let's dive deep into the world of EDI (Electronic Data Interchange) systems. We're going to break down the goals, scope, architecture, data structures, parsing flow, and so much more. This guide is designed to give you a solid understanding of how EDI systems work, especially in the context of ingesting 204 (Motor Carrier Load Tender) transactions reliably and efficiently.

Goals of the EDI System

Our main goal here is to build an EDI system that can reliably ingest 204s, map them to shipments, and validate the data. This is crucial for streamlining logistics operations and ensuring data accuracy. But there's more to it than just that. We want the system to be highly configurable, allowing us to adapt to different trading partners without having to dive into the code every time.

  • Reliable Ingestion and Mapping: The core of our EDI system is its ability to ingest 204 transactions accurately and map the data to our internal shipment records. This involves a robust parsing mechanism that can handle various EDI formats and versions. The mapping process needs to be flexible enough to accommodate the specific requirements of different trading partners. We're talking about turning raw EDI data into actionable shipment information, which is key to efficient logistics management. Think about it – every 204 contains vital details about a load, and our system needs to pluck out those details and slot them into the right places in our shipment database.

  • Configurability: One of the biggest challenges in EDI is dealing with the diverse standards and formats used by different trading partners. Our system must be configurable per trading partner, meaning we can adjust how we process EDI data based on who sent it. This configurability extends to field mappings, validation rules, and even the handling of specific data elements. The goal is to avoid hardcoding partner-specific logic into the system, which would make maintenance and scaling a nightmare. Instead, we use configuration files or a database to store partner-specific settings, allowing us to add new partners or modify existing configurations without code changes.

  • Self-Service Partner Onboarding and Monitoring: To make things even smoother, we're aiming for a self-service model where partners can onboard themselves and monitor their EDI transactions. This means providing tools and interfaces that allow partners to set up their profiles, upload sample EDI files, and track the status of their messages. Self-service onboarding not only reduces the burden on our IT team but also empowers partners to manage their EDI connections more effectively. Monitoring is equally important – partners need to be able to see whether their messages are being processed correctly, identify any errors, and take corrective action. This level of transparency and control enhances partner satisfaction and reduces support requests.

Initial Scope: Focusing on Inbound 204 Transactions

For the initial rollout, we're laser-focused on inbound X12 204 transactions. X12 is a common EDI standard, and the 204 transaction specifically deals with load tenders. We're likely targeting the 004010 version, but we'll need to confirm this. The goal is to parse these transactions, extract the relevant information, and create Shipment records in our system. This includes identifying key details such as customer information, reference IDs, stops, dates, and equipment.

  • Inbound X12 204 (004010): The starting point is to handle X12 204 transactions. This transaction set is the bread and butter of transportation EDI, as it communicates the details of a load tender from a shipper to a carrier. Focusing on a specific version (like 004010) allows us to narrow our development efforts and ensure compatibility with a widely used standard. The system needs to be able to handle the nuances of the 204 transaction, including its segments, loops, and data elements. This involves not only parsing the data but also understanding the business context behind each element. For example, knowing the difference between a shipper and a consignee, or how to interpret various date and time codes.

  • Parsing ISA/GS/ST Envelopes: EDI transactions are typically wrapped in envelopes, which provide routing and control information. The ISA (Interchange Control Header), GS (Functional Group Header), and ST (Transaction Set Header) segments form the layers of this envelope. Our system must be able to peel back these layers, validating the structure and control numbers to ensure the integrity of the transaction. Think of it like opening a series of nested boxes – each layer contains important information that guides the processing of the message. The ISA segment, for instance, contains the sender and receiver IDs, while the GS segment specifies the type of transaction. Accurate envelope parsing is the foundation for processing any EDI transaction.
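To make those nested envelope layers concrete, here's a minimal, illustrative interchange. The sender/receiver IDs, control numbers, and B2 values are all made up, and a real 204 body would contain many more segments between ST and SE (note the fixed-width padding in the ISA elements):

```text
ISA*00*          *00*          *ZZ*SENDERID       *ZZ*RECEIVERID     *240101*1200*U*00401*000000001*0*P*>~
GS*SM*SENDERID*RECEIVERID*20240101*1200*1*X*004010~
ST*204*0001~
B2**SCAC**SHIPMENTID**PP~
SE*3*0001~
GE*1*1~
IEA*1*000000001~
```

Notice how the trailers mirror the headers: SE closes the ST (and carries the segment count), GE closes the GS, and IEA closes the ISA, each repeating its control number so the two ends of the envelope can be matched up.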

  • Mapping to Shipment: Once we've parsed the 204 transaction, the next step is to map the data to our internal Shipment model. This involves extracting key fields and transforming them into a format that our system understands. The Shipment model might include attributes such as customer, reference IDs, stops, dates, equipment, and special services. Mapping is not always a one-to-one process; it may require data transformations, lookups, and business logic. For example, we might need to convert EDI date codes into a standard date format, or look up a customer ID based on an EDI qualifier. A flexible and configurable mapping engine is essential for handling the variations in EDI data from different partners.

  • Basic Compliance Validation: Validation is crucial to ensure the quality of the data. We need to implement basic compliance checks to catch common errors and inconsistencies in the EDI data. This includes verifying required segments, validating code sets, and checking data formats. For example, we might check that a required segment is present, that a date field is in the correct format, or that a code value is valid. Actionable errors are key – when a validation fails, the system should provide clear and informative messages that help users understand the problem and how to fix it. This might involve highlighting the specific segment or data element that caused the error.

  • CLI and Go API: To support local development and service integration, we're providing both a Command Line Interface (CLI) and a Go API. The CLI allows developers to test and debug EDI processing locally, while the Go API enables us to integrate the EDI system into our existing services. The CLI might provide commands for parsing EDI files, validating transactions, and mapping data. The Go API would expose functions and data structures that other Go programs can use to interact with the EDI system. This dual approach gives us the flexibility to use the EDI system in a variety of contexts.

EDI System Architecture: Building Blocks for Success

The architecture of our EDI system is designed to be modular and extensible. We're breaking it down into several key components, each responsible for a specific aspect of EDI processing. This includes a parser core, a 204 transaction layer, a mapping engine, a validation engine, and trading partner profiles. Each of these components plays a crucial role in ensuring the smooth and accurate processing of EDI data.

  • Parser Core: The parser core is the foundation of our EDI system. It's responsible for the low-level details of parsing EDI data, including delimiter detection, tokenization, segment parsing, and composite element support. Think of it as the engine that takes raw EDI text and turns it into a structured representation. Delimiter detection is the first step – identifying the characters that separate segments, elements, and components within the EDI message. Tokenization then breaks the input stream into individual tokens, which are the building blocks of EDI data. Segment parsing involves grouping these tokens into segments, which are the basic units of information in an EDI transaction. Composite element support allows us to handle complex data structures within elements. The parser core needs to be efficient and robust, as it handles the most fundamental aspect of EDI processing.
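As a sketch of the delimiter-detection step, here's a minimal Go function that exploits the fact that the 004010 ISA segment is fixed-width: the element separator is the byte right after "ISA", the component separator is byte 104, and the segment terminator is byte 105. The names (`Delimiters`, `DetectDelimiters`) are ours, not part of any library:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Delimiters holds the separators detected from a fixed-width ISA segment.
type Delimiters struct {
	Element   byte
	Component byte
	Segment   byte
}

// DetectDelimiters inspects the first 106 bytes of an X12 interchange.
// In 004010 the ISA segment is exactly 106 bytes including the terminator,
// so the separators sit at known offsets.
func DetectDelimiters(raw string) (Delimiters, error) {
	if len(raw) < 106 || !strings.HasPrefix(raw, "ISA") {
		return Delimiters{}, errors.New("input does not start with a full ISA segment")
	}
	return Delimiters{
		Element:   raw[3],   // byte immediately after "ISA"
		Component: raw[104], // ISA16
		Segment:   raw[105], // byte after ISA16
	}, nil
}

func main() {
	isa := "ISA*00*          *00*          *ZZ*SENDERID       *ZZ*RECEIVERID     *240101*1200*U*00401*000000001*0*P*>~"
	d, err := DetectDelimiters(isa)
	fmt.Println(d, err)
}
```

Once the delimiters are known, the tokenizer can split the rest of the stream on the segment terminator and element separator it just discovered, rather than assuming `*` and `~`.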

  • 204 Transaction Layer: This layer is where we define the typed model for the 204 transaction. It provides a structured representation of the 204 data, including loops and segments such as N1, S5, and LX. The 204 transaction layer builds on the parser core by interpreting the raw EDI data in the context of the 204 transaction set. A typed model means that we define specific data structures in our programming language (Go, in this case) to represent the elements and segments of the 204. This makes it easier to work with the data and reduces the risk of errors. Loops, such as the N1 loop for party information and the S5 loop for stops, are a key feature of EDI transactions. The 204 transaction layer provides a way to navigate and process these loops in a structured manner.

  • Mapping Engine: The mapping engine is responsible for transforming EDI data into our internal Shipment domain model. It uses partner-configured field mappings to determine how data elements in the 204 transaction correspond to fields in our Shipment model. This is where the rubber meets the road – we're taking the structured EDI data and turning it into something that our business systems can understand. The mapping engine needs to be flexible enough to handle the variations in EDI formats and data structures from different trading partners. Partner-configured mappings mean that we can define these transformations on a per-partner basis, typically using a declarative configuration format like YAML or JSON. This allows us to adapt to different EDI standards and partner-specific requirements without changing the code.

  • Validation Engine: The validation engine ensures that the EDI data meets our compliance requirements. It applies rules per version and partner, checking for required segments, code sets, and formats. Validation is critical for data quality – it helps us catch errors early in the process and prevent bad data from entering our systems. The validation engine needs to be configurable so that we can define different rules for different EDI versions and trading partners. Rules might include checks for required segments (e.g., ensuring that the B2 segment is present in a 204 transaction), code set validation (e.g., verifying that a transportation method code is valid), and format validation (e.g., checking that a date field conforms to the EDI date format). Actionable error messages are essential – when a validation rule fails, the engine should provide clear and informative messages that help users understand the problem and how to fix it.

  • Trading Partner Profiles: Trading partner profiles store configuration information for each partner, including their EDI version, delimiters, usage, custom mappings, and tolerances. These profiles are the key to our system's configurability – they allow us to tailor EDI processing to the specific requirements of each partner. The profile might include information such as the EDI version (e.g., 004010 or 005010), the delimiters used in their EDI messages (e.g., the segment terminator and element separator), and any custom mappings or validation rules that apply to this partner. Tolerances define how strictly we should enforce EDI standards – for example, whether we should allow optional segments or be lenient with code set validation. By centralizing this configuration information in trading partner profiles, we can manage EDI processing more efficiently and adapt to changes in partner requirements more easily.

  • Envelopes: EDI messages are structured in a hierarchical envelope system: Interchange → Group → Transaction. Our architecture must handle this hierarchy, validating control numbers at each level to ensure message integrity. This is a bit like the postal service for EDI – envelopes contain the routing and control information that guides the message through the system. The Interchange envelope (ISA) is the outermost layer, containing information about the sender and receiver. The Group envelope (GS) groups related transactions together. The Transaction Set envelope (ST) contains the actual EDI transaction, such as a 204. Control numbers are used to track messages and prevent duplicates – each envelope has a unique control number that we can use to ensure that messages are processed exactly once. Validating these control numbers is a critical part of ensuring the reliability of our EDI system.

  • Extensibility: We're designing the system with future expansion in mind. It should be extensible to handle other transaction sets like 990 (Response to a Load Tender), 997/999 (Functional Acknowledgements), 214 (Shipment Status), and 210 (Freight Invoice). This means building a modular architecture that allows us to add new transaction set handlers without disrupting existing functionality. Extensibility is key to the long-term viability of our EDI system – as our business needs evolve, we'll need to support a wider range of EDI transactions. By designing for extensibility from the start, we can avoid costly refactoring and ensure that our system remains adaptable to future requirements.

Data Structures: The Building Blocks of EDI Information

To effectively manage EDI data, we need well-defined data structures. This includes generic structures for X12 and specific typed models for 204 transactions. These structures help us represent EDI information in a way that's easy to work with and understand.

  • Generic X12 Structures: These are the basic building blocks for any X12 transaction. They include structures for Interchange, Group, TransactionSet, Segment, and Element. A Segment consists of a name and an array of Elements, while an Element contains a value and an array of Components. Think of these as the raw materials of EDI – the fundamental data types that we use to represent any X12 message. The Interchange structure represents the ISA envelope, the Group structure represents the GS envelope, and the TransactionSet structure represents the ST envelope. A Segment is a line of data in the EDI message, such as B2 or N1. Each Segment is composed of Elements, which are the individual data fields within the segment. Elements can be simple values or composite structures containing Components. By defining these generic structures, we create a foundation for parsing and processing any X12 transaction.
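One plausible shape for these generic structures in Go looks like the sketch below. The type and field names are our own choices, not a standard API; the point is the nesting: Interchange → Group → TransactionSet → Segment → Element → Component:

```go
package main

import "fmt"

// Component is the smallest unit of X12 data, nested inside a composite element.
type Component struct {
	Value string
}

// Element is a single data field within a segment. Simple elements carry a
// Value; composite elements carry Components as well.
type Element struct {
	Value      string
	Components []Component
}

// Segment is one line of EDI data, such as B2 or N1.
type Segment struct {
	Name     string
	Elements []Element
}

// TransactionSet corresponds to one ST...SE envelope.
type TransactionSet struct {
	ControlNumber string
	Segments      []Segment
}

// Group corresponds to one GS...GE envelope.
type Group struct {
	ControlNumber string
	Transactions  []TransactionSet
}

// Interchange corresponds to one ISA...IEA envelope.
type Interchange struct {
	ControlNumber string
	Groups        []Group
}

func main() {
	seg := Segment{Name: "N1", Elements: []Element{{Value: "SH"}, {Value: "ACME SHIPPING"}}}
	fmt.Println(seg.Name, seg.Elements[1].Value)
}
```

Because these types know nothing about the 204 specifically, the same parser core can feed any transaction set we add later (990, 214, 210, and so on).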

  • 204 Typed Model: This is a specific data structure tailored to the 204 transaction. It includes structures for Header (B2, B2A, L11/REF), Parties (N1 loop), Stops (S5 loop + N1/AT5/DTM/LX/OID), Equipment (N7), Notes (NTE), Dates (DTM), and Charges (L3 if present). This typed model makes it easier to work with 204 data because it provides a structured representation of the transaction's specific elements. The Header structure contains information from segments like B2 (Beginning Segment for Shipment Information) and L11 (Reference Information). The Parties structure represents the various parties involved in the shipment, such as the shipper and consignee. The Stops structure represents the pickup and delivery locations, along with associated information like dates and times. The Equipment structure describes the equipment used for the shipment, such as the trailer type. By defining this typed model, we create a clear and consistent way to access and manipulate 204 data.
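A trimmed sketch of what that typed model might look like in Go is below. Field choices and names are illustrative, not a complete 204 mapping; the comments note which X12 elements each field would come from:

```go
package main

import "fmt"

// Party models one N1 loop entry (N1 plus optional N3/N4 address segments).
type Party struct {
	Role             string // N101, e.g. "SH" (shipper) or "CN" (consignee)
	Name             string // N102
	Address          []string
	City, State, Zip string
}

// Stop models one S5 loop entry with its nested segments.
type Stop struct {
	Sequence int    // S501 stop sequence number
	Reason   string // S502 stop reason code, e.g. "CL" (complete load), "CU" (complete unload)
	Party    *Party
	Dates    []string // raw date/time values; left unparsed in this sketch
}

// Transaction204 is a trimmed typed model for a 204 load tender.
type Transaction204 struct {
	SCAC       string            // B202
	ShipmentID string            // B204
	References map[string]string // L11/REF values, keyed by qualifier
	Parties    []Party
	Stops      []Stop
}

func main() {
	txn := Transaction204{
		SCAC:       "SCAC",
		ShipmentID: "S123",
		References: map[string]string{"PO": "4500012345"},
		Stops:      []Stop{{Sequence: 1, Reason: "CL"}},
	}
	fmt.Println(txn.ShipmentID, len(txn.Stops))
}
```

The win here is type safety: downstream mapping code works with `txn.Stops[0].Sequence` as an `int`, not with positional element lookups against raw segments.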

  • Domain Model (Shipment): This represents our internal view of a shipment, including information such as customer, load references, pickup/delivery stops, appointments, equipment, accessorials, and notes. The Domain Model is the bridge between the EDI world and our internal business systems. It defines the data structures that we use to represent shipments within our application. The customer field identifies the customer associated with the shipment. Load references are identifiers like purchase order numbers and pro numbers. Pickup and delivery stops represent the locations where the shipment will be picked up and delivered. Appointments specify the scheduled times for pickup and delivery. Equipment describes the type of equipment used for the shipment. Accessorials are additional services required for the shipment, such as liftgate service. Notes provide additional information about the shipment. By mapping EDI data to our Domain Model, we can seamlessly integrate EDI transactions into our business processes.

Parsing Flow: How We Process EDI Transactions

The parsing flow is the sequence of steps our system takes to process an EDI transaction. It reads the ISA segment to detect separators, streams segments, validates the envelope structure, builds a TransactionSet AST, constructs a typed 204 model, and emits structured errors.

  • Read ISA for Separators: The first step is to read the ISA segment, which contains the delimiters used in the EDI message. These delimiters define how segments, elements, and components are separated. This is like finding the key to decode the rest of the message. The ISA segment specifies the segment terminator, the element separator, the component separator, and the repetition separator. These delimiters are crucial for parsing the EDI message correctly. For example, the segment terminator indicates the end of a segment, while the element separator distinguishes between the individual data fields within a segment. By reading the ISA segment first, we can dynamically detect the delimiters used in the message and adjust our parsing logic accordingly.

  • Stream Segments: Next, we stream the segments from the EDI file, validating the ISA/GS/ST structure and control numbers as we go. Streaming allows us to process large EDI files efficiently without loading the entire file into memory. As we stream each segment, we validate the envelope structure to ensure that the EDI message is well-formed. This includes checking that the ISA, GS, and ST segments are present and in the correct order. We also validate control numbers to prevent duplicate messages and ensure message integrity. This step is like checking the envelope of a letter to make sure it's properly addressed and hasn't been tampered with.

  • Build TransactionSet AST: We then build an Abstract Syntax Tree (AST) for the TransactionSet, preserving the order of segments. The AST is a tree-like representation of the EDI message that makes it easier to navigate and process the data. Preserving the order of segments is important because the sequence of segments can have semantic meaning in EDI. The AST provides a structured way to access and manipulate the EDI data, making it easier to implement mapping and validation logic. Think of the AST as a blueprint of the EDI message, showing the relationships between segments and elements.

  • Build Typed 204 Model: If the transaction is a 204, we build a typed 204 model by walking the segments and loops in the AST. This typed model provides a structured representation of the 204 data, making it easier to access and manipulate. This is where we take the generic EDI structure and turn it into something specific to the 204 transaction. The typed model includes structures for the header, parties, stops, equipment, and other key elements of the 204. By building a typed model, we can work with the 204 data in a more intuitive and type-safe manner.

  • Emit Structured Errors: Finally, we emit structured errors with segment position and example fixes. This makes it easier to identify and correct errors in the EDI data. Structured errors provide detailed information about the error, including the segment index, the position within the segment, and a description of the problem. Example fixes suggest how to correct the error, such as providing the correct code value or adding a missing segment. These structured errors are crucial for troubleshooting EDI processing and ensuring data quality.
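One way to shape those structured errors in Go is sketched below. The field names and the loop-path notation (`S5[0].G62`) are our own conventions, shown here only to illustrate the idea of pinning an error to a location plus a suggested fix:

```go
package main

import "fmt"

// ValidationError is one machine-parseable finding, pinned to a location
// in the interchange so a partner can find and fix the offending data.
type ValidationError struct {
	SegmentIndex int    // 0-based position of the segment in the interchange
	Path         string // loop/segment path, e.g. "S5[0].G62" (our own notation)
	Code         string // stable rule identifier for programmatic handling
	Message      string // human-readable description
	SuggestedFix string // example fix, e.g. a valid code value
}

// Error implements the standard error interface.
func (e ValidationError) Error() string {
	return fmt.Sprintf("%s at segment %d (%s): %s", e.Code, e.SegmentIndex, e.Path, e.Message)
}

func main() {
	err := ValidationError{
		SegmentIndex: 12,
		Path:         "S5[0].G62",
		Code:         "DATE_FORMAT",
		Message:      "date must be CCYYMMDD",
		SuggestedFix: "use 20240305 instead of 03/05/2024",
	}
	fmt.Println(err.Error())
	// prints "DATE_FORMAT at segment 12 (S5[0].G62): date must be CCYYMMDD"
}
```

Keeping `Code` stable and machine-friendly while `Message` stays human-friendly lets the same error feed both an automated dashboard and a partner-facing error report.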

204 Coverage (MVP): What We'll Handle Initially

For the Minimum Viable Product (MVP), we're focusing on the essential parts of the 204 transaction. This includes required segments, key parties, references, dates, stops, and equipment.

  • Required Segments: We'll handle the required segments such as ISA, GS, ST, B2, and SE/GE/IEA. These segments are the backbone of the EDI message and must be present for the transaction to be valid. The ISA, GS, and ST segments form the envelope, while the B2 segment contains basic shipment information. The SE, GE, and IEA segments are trailers that mark the end of the transaction, group, and interchange, respectively. Handling these required segments is the first step in processing any EDI transaction.

  • Parties (N1 Loop): We'll cover the N1 loop for parties such as BT (Bill-to), SH (Shipper), CN (Consignee), SF (Ship From), and ST (Ship To), including N3/N4 (address) and REF (reference) segments. The N1 loop is used to identify the different parties involved in the shipment. Each N1 segment represents a party, and the N101 element specifies the party type (e.g., shipper, consignee). The N3/N4 segments provide the address information for the party, while the REF segments provide additional reference numbers. Handling the N1 loop is crucial for identifying the key players in the shipment and their contact information.

  • References (L11, REF, G61): We'll handle reference information from segments like L11, REF, and G61 (contacts). These segments provide additional information about the shipment, such as purchase order numbers, pro numbers, and contact details. The L11 segment is a general-purpose reference segment, while the REF segment is used for specific types of references, such as the bill of lading number. The G61 segment provides contact information, such as the name and phone number of the contact person. Handling these reference segments allows us to capture important details about the shipment that may not be included in other segments.

  • Dates (DTM): We'll process dates from the DTM segment, including pickup and delivery windows. The DTM segment is used to specify dates and times for various events, such as the scheduled pickup and delivery times. Different DTM qualifiers indicate the type of date being specified, such as the requested pickup date or the estimated delivery date. Handling the DTM segment is crucial for coordinating the shipment and ensuring timely delivery.

  • Stops (S5 Loop): We'll handle the S5 loop with stop type, sequence, and nested N1/AT5/DTM/LX. The S5 loop represents the stops along the shipment route, including pickup and delivery locations. The stop type indicates whether the stop is a pickup or delivery. The sequence number specifies the order of the stops. Nested segments like N1 (party), AT5 (accessorial), DTM (date/time), and LX (line item) provide additional information about each stop. Handling the S5 loop is essential for routing the shipment and managing the stops along the way.

  • Equipment (N7): We'll cover equipment information from the N7 segment (type/number), possibly AT5 (accessorials), and HAZ if needed later. The N7 segment specifies the equipment used for the shipment, such as the trailer type and number. The AT5 segment may provide additional information about equipment-related accessorials, such as temperature control requirements. The HAZ segment is used to indicate whether the shipment contains hazardous materials. Handling the N7 segment allows us to track the equipment used for the shipment and ensure that the appropriate equipment is used.

  • Charges (L3): We'll handle charges from the L3 segment, which is optional for the MVP. The L3 segment provides information about the charges associated with the shipment, such as the freight charges and fuel surcharges. Handling the L3 segment allows us to calculate the total cost of the shipment.

  • Edge Cases: We'll address edge cases such as repeated REF/L11, multiple stops, and multiple LX line items (optional for MVP). These are less common scenarios that require special handling. Repeated REF/L11 segments may contain multiple reference numbers or descriptions. Multiple stops indicate that the shipment has more than one pickup or delivery location. Multiple LX line items may represent different items being shipped. Handling these edge cases ensures that our system can process a wide range of EDI transactions.

Trading Partner Configuration: Tailoring the System

To handle the variations between trading partners, we need a robust configuration system. This includes version and usage settings, mapping configurations, normalization rules, tolerances, and test fixtures.

  • Version and Usage: We'll configure the system for different EDI versions (e.g., 004010 vs 005010) and usage profiles, including optional segments and element codes. Different trading partners may use different versions of the EDI standard, so we need to be able to handle multiple versions. Usage profiles define which segments and elements are required or optional. For example, one partner may require a certain segment that another partner considers optional. Configuring the version and usage allows us to tailor the system to the specific requirements of each trading partner.

  • Mapping: We'll use declarative YAML/JSON mapping to define the transformation of 204 fields to Shipment fields, including transforms. This is where we specify how the data elements in the EDI message map to the fields in our internal Shipment model. Declarative mapping means that we define the mapping rules in a configuration file, rather than hardcoding them into the system. YAML and JSON are common formats for these configuration files. Transforms are functions that modify the data during the mapping process, such as converting a date format or looking up a code value. Configuring the mapping allows us to translate EDI data into a format that our system can understand.
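A per-partner mapping file might look something like the YAML below. This is purely illustrative – the schema, the path syntax (`N1[role=SH].02`), and the transform names are all invented for this sketch, not a fixed format:

```yaml
# Illustrative per-partner mapping; every name here is an example, not a schema
partner: ACME-LOGISTICS
version: "004010"
mappings:
  - source: B2.04              # Shipment Identification Number
    target: shipment.reference_id
  - source: L11[qualifier=PO].01
    target: shipment.po_number
  - source: N1[role=SH].02     # shipper name
    target: shipment.shipper.name
  - source: S5[*].G62.02
    target: shipment.stops[*].appointment_date
    transform: edi_date        # CCYYMMDD -> ISO 8601
```

The key property is that adding a partner means adding a file like this, not writing new Go code – the mapping engine interprets these entries at runtime.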

  • Normalization: We'll implement code tables (equipment, accessorials, qualifiers) with partner overrides. Code tables define the valid values for certain data elements, such as the equipment type or accessorial code. Partner overrides allow us to customize these code tables for specific trading partners. For example, one partner may use a different code for a particular equipment type than another partner. Normalization ensures that the data is consistent and standardized, making it easier to process and analyze.

  • Tolerances: We'll configure strict/lenient modes, default values, and required field behavior. Tolerances define how strictly we should enforce the EDI standard. Strict mode means that we'll reject messages that don't conform to the standard, while lenient mode means that we'll try to process the message even if there are minor errors. Default values specify what value to use if a required field is missing. Required field behavior defines how we should handle missing required fields. Configuring tolerances allows us to balance data quality with processing efficiency.

  • Test Fixtures: We'll use test fixtures per partner for regression testing. Test fixtures are sample EDI messages that we use to test the system. By creating test fixtures for each trading partner, we can ensure that the system processes their messages correctly. Regression testing means that we run these tests whenever we make changes to the system, to ensure that we haven't introduced any new errors. Using test fixtures is crucial for maintaining the quality and reliability of our EDI system.

Validation: Ensuring Data Quality

Validation is a critical part of EDI processing. We need to implement structural, semantic, and cross-field validations, as well as partner overrides and rule waivers.

  • Structural Validation: This includes validating envelopes, control numbers, and counts. Structural validation ensures that the EDI message is well-formed and conforms to the basic EDI syntax rules. This includes checking that the ISA, GS, and ST segments are present and in the correct order, that the control numbers match, and that the segment counts are correct. Structural validation is the first line of defense against bad EDI data.

  • Semantic Validation: This includes validating required segments, qualifier/code sets, date/time formats, and numeric fields. Semantic validation ensures that the data values are valid and consistent with the EDI standard. This includes checking that required segments are present, that code values are valid, that dates and times are in the correct format, and that numeric fields are within the expected range. Semantic validation goes beyond syntax checking to ensure that the data makes sense.

  • Cross-Field Validation: This includes validating stop sequences, date ranges, and party presence per stop type. Cross-field validation ensures that the data values are consistent with each other. For example, we might check that the stop sequence numbers are in the correct order, that the delivery date is after the pickup date, and that the required parties are present for each stop type. Cross-field validation catches errors that would not be detected by structural or semantic validation alone.
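To make cross-field validation concrete, here's a minimal Go sketch of one such rule. The function name is ours, and the "first stop must be a pickup" check is an example business rule for load tenders, not a universal X12 requirement:

```go
package main

import "fmt"

// Stop is a minimal stand-in for a parsed S5 loop entry.
type Stop struct {
	Sequence int
	Reason   string // S502 stop reason code, e.g. "CL" (complete load), "CU" (complete unload)
}

// CheckStopSequence applies two cross-field rules: stop sequence numbers
// must be strictly increasing, and (as an example business rule) the first
// stop must be a pickup.
func CheckStopSequence(stops []Stop) []string {
	var errs []string
	for i := 1; i < len(stops); i++ {
		if stops[i].Sequence <= stops[i-1].Sequence {
			errs = append(errs, fmt.Sprintf("stop %d: sequence %d not greater than previous %d",
				i, stops[i].Sequence, stops[i-1].Sequence))
		}
	}
	if len(stops) > 0 && stops[0].Reason != "CL" && stops[0].Reason != "LD" {
		errs = append(errs, "first stop must be a pickup (CL/LD)")
	}
	return errs
}

func main() {
	stops := []Stop{{1, "CL"}, {1, "CU"}}
	fmt.Println(CheckStopSequence(stops))
}
```

Rules like this only make sense after the typed model is built, which is why cross-field validation runs as a separate pass from the structural and semantic checks.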

  • Partner Overrides and Rule Waivers: We'll implement partner overrides and rule waivers to handle partner-specific exceptions. Some trading partners may have slightly different requirements or may deviate from the EDI standard in certain ways. Partner overrides allow us to customize the validation rules for specific partners. Rule waivers allow us to disable certain validation rules for specific partners or messages. These features provide the flexibility to handle the variations between trading partners.

  • Errors: We'll ensure errors are machine-parseable with segment index and path. When a validation rule fails, we need to provide clear and informative error messages. These error messages should include the segment index and path, so that users can easily identify the location of the error in the EDI message. Machine-parseable errors mean that the error messages are in a structured format that can be easily processed by a computer program. This allows us to automate error handling and reporting.

Mapping to Shipment: Transforming EDI Data

Mapping is the process of transforming EDI data into our internal Shipment model. This involves identifying keys, mapping parties, stops, equipment, and accessorials, and producing a Shipment DTO plus raw data for audit.

  • Keys: We'll use keys such as customer account, PO/pro numbers (L11/REF), and BOL/SHIPMENT IDs to identify shipments. These keys allow us to uniquely identify shipments in our system. The customer account identifies the customer associated with the shipment. PO/pro numbers are reference numbers used to track the shipment. BOL/SHIPMENT IDs are identifiers assigned to the shipment by the carrier. Using these keys, we can ensure that we're mapping the EDI data to the correct shipment record.

  • Parties: We'll map shipper/consignee/bill-to from N1 loops with address details. The N1 loop contains information about the parties involved in the shipment, such as the shipper, consignee, and bill-to party. We need to extract this information and map it to the corresponding fields in our Shipment model. This includes mapping the party name, address, and contact information.

  • Stops: We'll map stops from S5 loops, appointments from DTM, and notes from NTE. The S5 loop contains information about the stops along the shipment route. We need to extract this information and map it to the stops in our Shipment model. This includes mapping the stop type, location, and scheduled dates and times. Appointments are extracted from the DTM segments within the S5 loop. Notes are extracted from the NTE segments within the S5 loop. Mapping the stops accurately is crucial for routing the shipment correctly.

  • Equipment: We'll map type/size, temp-controlled flags, and HAZ flags (future). The equipment information is contained in the N7 segment. We need to extract this information and map it to the equipment fields in our Shipment model. This includes mapping the equipment type, size, and any special requirements, such as temperature control or hazardous materials handling. Mapping the equipment information allows us to ensure that the shipment is transported using the appropriate equipment.

  • Accessorials: We'll map AT5/REF/FOB to internal codes. Accessorials are additional services required for the shipment, such as liftgate service or inside delivery. The AT5, REF, and FOB segments may contain information about accessorials. We need to extract this information and map it to our internal codes for accessorials. This allows us to accurately capture the services required for the shipment.

  • Output: We'll output a Shipment DTO plus "raw envelope + segment trail" for audit. The Shipment DTO (Data Transfer Object) is a structured representation of the shipment data that we can use in our internal systems. The raw envelope and segment trail provide a complete record of the EDI transaction, which is useful for auditing and troubleshooting. By outputting both the DTO and the raw data, we provide a balance between usability and auditability.
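The mapping output described above could be sketched roughly like this. Every field and type name here is an assumption standing in for the final internal model:

```go
package main

import "fmt"

// Party is mapped from an N1 loop (role comes from the N1 qualifier).
type Party struct {
	Role    string // "shipper", "consignee", "bill-to"
	Name    string
	Address string
}

// Stop is mapped from an S5 loop, with DTM appointments and NTE notes.
type Stop struct {
	Sequence    int
	Type        string // pickup or delivery
	Appointment string
	Notes       []string
}

// Shipment is the DTO produced by the mapping engine.
type Shipment struct {
	CustomerAccount string
	ProNumber       string // from L11/REF
	BOL             string
	Parties         []Party
	Stops           []Stop
	EquipmentType   string   // from N7
	Accessorials    []string // internal codes mapped from AT5/REF/FOB
}

// MappingResult pairs the DTO with the raw envelope and segment trail
// that are retained for audit.
type MappingResult struct {
	Shipment     Shipment
	RawEnvelope  string   // original interchange text, kept verbatim
	SegmentTrail []string // ordered segment IDs that contributed to the mapping
}

func main() {
	r := MappingResult{
		Shipment:     Shipment{BOL: "BOL123", ProNumber: "PRO987"},
		RawEnvelope:  "ISA*00*...~IEA*1*000000001~",
		SegmentTrail: []string{"ST", "B2", "S5", "SE"},
	}
	fmt.Printf("%s mapped from %d segments\n", r.Shipment.BOL, len(r.SegmentTrail))
}
```

Keeping `RawEnvelope` and `SegmentTrail` alongside the DTO means an auditor can trace any Shipment field back to the exact segments that produced it.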

Self-Service (Phased): Empowering Trading Partners

We're planning a phased rollout of self-service capabilities for trading partners. This will empower them to manage their EDI connections more effectively.

  • Phase 1: Partner profiles in admin UI, upload sample 204, validate and preview mapping, download error report. In the first phase, we'll provide a user interface where partners can create and manage their profiles. This includes setting their EDI version, delimiters, and other configuration settings. Partners will also be able to upload sample 204 messages and validate them against their profile. The system will provide a preview of the mapping, so partners can see how the EDI data will be transformed into our Shipment model. Partners will also be able to download an error report, which lists any validation errors and helps them troubleshoot their EDI messages.

  • Phase 2: Visual mapper (drag segment → Shipment field), test harness, code table manager. In the second phase, we'll introduce a visual mapper that allows partners to map EDI segments to Shipment fields using a drag-and-drop interface. This will make it easier for partners to configure the mapping without having to write code or edit configuration files. We'll also provide a test harness, which allows partners to test their mappings and validation rules. Finally, we'll implement a code table manager, which allows partners to view and customize the code tables used for validation.

  • Phase 3: End-to-end monitoring, partner-specific dashboards, retry/replay, webhooks. In the third phase, we'll provide end-to-end monitoring of EDI transactions. This will allow partners to track the status of their messages and identify any issues. We'll also create partner-specific dashboards, which provide a customized view of their EDI activity. Finally, we'll implement retry/replay functionality, which allows partners to resubmit failed messages, and webhooks, which allow partners to receive notifications about EDI events.

Go Package Layout: Organizing the Code

We're structuring the Go packages to reflect the modular architecture of the system. This includes packages for specifications, transaction handling, mapping, validation, partner profiles, CLI, and test data.

  • internal/specs/x12/004010/: This package contains the 204 schema, including segment usage, loops, and rules. The schema defines the structure of the 204 transaction and the rules for validating the data. This package provides a central location for the EDI specification, making it easier to maintain and update.
  • internal/tx/tx204/: This package contains the typed 204 model and builder. The typed model is a Go data structure that represents the 204 transaction. The builder is a tool for constructing instances of the typed model from the parsed EDI data. This package provides a high-level interface for working with 204 data.
  • internal/mapping/: This package contains the mapping engine and transforms. The mapping engine is responsible for transforming EDI data into our internal Shipment model. Transforms are functions that modify the data during the mapping process. This package provides the core functionality for mapping EDI data.
  • internal/validation/: This package contains the rules engine and code sets. The rules engine is responsible for validating the EDI data against the EDI standard and our business rules. Code sets define the valid values for certain data elements. This package provides the core functionality for validating EDI data.
  • partners/: This package contains partner profiles and config loading. Partner profiles store configuration information for each trading partner, such as their EDI version and delimiters. Config loading is the process of loading these profiles into the system. This package provides the infrastructure for managing partner-specific configuration.
  • cmd/edi-cli/: This package contains the CLI for parsing, validating, mapping, and printing JSON. The CLI is a command-line interface that allows developers to interact with the EDI system. This package provides a tool for testing and debugging the EDI system.
  • testdata/204/: This package contains sample EDI files and golden JSON. Sample EDI files are used for testing the parser and validator. Golden JSON files are the expected output of the mapping process. This package provides the data needed for testing the EDI system.
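To make the partners/ package concrete, here is a minimal sketch of what a partner profile and its config loading might look like. The JSON shape and field names are assumptions:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PartnerProfile is a sketch of the per-partner configuration stored
// under partners/. Field names and the JSON shape are assumptions.
type PartnerProfile struct {
	PartnerID         string `json:"partner_id"`
	X12Version        string `json:"x12_version"`        // e.g. "004010"
	ElementSeparator  string `json:"element_separator"`  // e.g. "*"
	SegmentTerminator string `json:"segment_terminator"` // e.g. "~"
}

// loadProfile parses a profile from raw JSON, wrapping parse errors
// with context so config problems are easy to trace to a partner file.
func loadProfile(data []byte) (PartnerProfile, error) {
	var p PartnerProfile
	if err := json.Unmarshal(data, &p); err != nil {
		return PartnerProfile{}, fmt.Errorf("parse partner profile: %w", err)
	}
	return p, nil
}

func main() {
	raw := []byte(`{"partner_id":"ACME","x12_version":"004010","element_separator":"*","segment_terminator":"~"}`)
	p, err := loadProfile(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(p.PartnerID, p.X12Version) // prints "ACME 004010"
}
```

Keeping these settings in data rather than code is what lets us onboard a new partner without a deploy.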

Testing & Samples: Ensuring Reliability

We're using a variety of testing techniques to ensure the reliability of the system. This includes golden tests, corpus testing, fuzz tests, and benchmarks.

  • Golden Tests: We'll use golden tests for EDI → AST → typed 204 → mapped Shipment JSON. Golden tests compare the output of the system against a known correct output. This allows us to verify that the system is working as expected. These tests cover the entire EDI processing pipeline, from parsing the EDI message to mapping the data to our Shipment model.

  • Corpus: We'll use a corpus of valid/invalid 204s across common partners. A corpus is a collection of sample EDI messages. By testing the system against a corpus of messages, we can ensure that it can handle a wide range of real-world scenarios. The corpus includes both valid and invalid messages, so we can test the system's error handling capabilities.

  • Fuzz Tests: We'll use fuzz tests for tokenization and delimiter detection. Fuzz testing is a technique for finding bugs by feeding the system with random input. This allows us to uncover unexpected errors and vulnerabilities. We'll focus on fuzz testing the tokenization and delimiter detection components, as these are critical for parsing EDI messages correctly.

  • Benchmarks: We'll use benchmarks for large files and memory profiling. Benchmarks measure the performance of the system. This allows us to identify performance bottlenecks and optimize the system for speed and efficiency. We'll focus on benchmarking the system's ability to handle large EDI files and its memory usage.
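The heart of a golden test is a byte-for-byte comparison against a checked-in expected file. A minimal sketch of that comparison, reporting the first differing offset so failures are easy to localize (the helper name is hypothetical):

```go
package main

import (
	"bytes"
	"fmt"
)

// goldenCompare diffs actual pipeline output against a stored golden
// file, returning nil on an exact match and a positioned error otherwise.
func goldenCompare(actual, golden []byte) error {
	if bytes.Equal(actual, golden) {
		return nil
	}
	n := len(actual)
	if len(golden) < n {
		n = len(golden)
	}
	for i := 0; i < n; i++ {
		if actual[i] != golden[i] {
			return fmt.Errorf("output differs from golden at byte %d", i)
		}
	}
	return fmt.Errorf("output length %d differs from golden length %d", len(actual), len(golden))
}

func main() {
	if err := goldenCompare([]byte(`{"bol":"B1"}`), []byte(`{"bol":"B1"}`)); err != nil {
		panic(err)
	}
	fmt.Println("golden match")
}
```

In the real suite this would run against the sample files under testdata/204/, typically with an opt-in flag to regenerate golden files when the mapping intentionally changes.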

Observability & Ops: Monitoring the System

We're implementing observability and operational features to monitor the system's health and performance. This includes structured logs, metrics, dead-letter queues, and idempotency.

  • Structured Logs: We'll use structured logs for partner, control numbers, counts, and errors. Structured logs are logs that are in a machine-readable format, such as JSON. This makes it easier to analyze the logs and identify issues. We'll log information about the partner, control numbers, segment counts, and any errors that occur. This will give us valuable insights into the system's behavior.

  • Metrics: We'll track metrics such as parse time, error rates, segment counts, and partner SLOs. Metrics are quantitative measurements of the system's performance. This allows us to track trends and identify potential problems. We'll track metrics such as the time it takes to parse EDI messages, the error rates, the number of segments processed, and the service level objectives (SLOs) for each partner.

  • Dead-Letter Queue: We'll implement a dead-letter queue for rejects and a roadmap for auto-acks (997/999). A dead-letter queue is a queue that holds messages that could not be processed. This allows us to investigate and resolve the issues that caused the messages to fail. We'll also implement auto-acks (997/999), which are functional acknowledgments that confirm the receipt of an EDI message.

  • Idempotency: We'll ensure idempotency based on control numbers and reference keys. Idempotency means that processing the same message multiple times has the same effect as processing it once. This is important for preventing duplicate shipments and other data integrity issues. We'll use control numbers and reference keys to ensure that messages are processed idempotently.
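One way to get idempotency is to derive a stable dedup key from the control numbers and a reference key, and reject any message whose key has already been processed. The exact inputs below are an assumption; any stable, collision-resistant combination works:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// idempotencyKey sketches deriving a stable dedup key from the partner,
// the ISA and ST control numbers, and a reference key (e.g. the BOL).
// The same message always yields the same key, so a store of seen keys
// can short-circuit duplicate processing.
func idempotencyKey(partnerID, isaControl, stControl, refKey string) string {
	h := sha256.Sum256([]byte(strings.Join([]string{partnerID, isaControl, stControl, refKey}, "|")))
	return hex.EncodeToString(h[:])
}

func main() {
	k := idempotencyKey("ACME", "000000101", "0001", "BOL123")
	fmt.Println(k[:16]) // prefix shown for brevity
}
```

Hashing with a separator between fields avoids ambiguity (so "AB" + "C" and "A" + "BC" produce different keys), and the fixed-length key is convenient as a unique-index column.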

Security: Protecting the System

Security is a top priority. We're implementing protective measures including input size limits, log sanitization, and config change audits.

  • Input Size Limits: We'll enforce input size limits and use a streaming parser to avoid memory spikes. This prevents attackers from overwhelming the system with large EDI files. A streaming parser processes the EDI message in chunks, rather than loading the entire file into memory. This reduces the memory footprint of the system and improves its scalability.

  • Sanitization of Logs: We'll sanitize logs and handle PII (Personally Identifiable Information) for addresses and contacts. This prevents sensitive information from being exposed in the logs. Sanitization involves removing or masking PII, such as names, addresses, and contact information. We'll also implement measures to protect PII in our databases and other systems.

  • Config Change Audit: We'll implement config change audit and versioning. This allows us to track changes to the system's configuration and revert to previous versions if necessary. This helps us to maintain the security and stability of the system.

Milestones: Tracking Progress

We've defined milestones to track our progress and ensure we're on track to deliver the system on time.

  • M1: Parser core + envelopes + 204 typed model (dry-run JSON). This milestone focuses on the core parsing functionality and the 204 typed model. The output will be a dry-run JSON representation of the parsed data.
  • M2: Mapping engine to Shipment + validations + CLI. This milestone focuses on mapping the EDI data to our Shipment model, implementing validation rules, and building the CLI.
  • M3: Partner profiles + sample partners + admin config UI (basic). This milestone focuses on partner-specific configuration and building a basic admin UI.
  • M4: Monitoring + error dashboards + basic acks (999) if needed. This milestone focuses on monitoring the system and providing error dashboards, as well as implementing basic acknowledgments (999).

Clarifications: Key Questions

Before we dive in, let's clarify a few key questions to ensure we're all on the same page.

  • Which X12 version(s) to target first? 004010 standard?
  • Do you need 990 response generation alongside 204 ingestion now, or later?
  • Must we produce 997/999 functional acknowledgements immediately?
  • Any specific partners to model first (to shape code tables/mappings)?
  • Preferred internal shipment fields to lock mapping targets?

If this direction looks right, I'll scaffold the Go packages and start with the tokenizer and 204 typed model, then wire up a CLI with sample tests.