Data Flow Diagrams (DFD): Levels, Symbols, and Practical Examples

Learn how to create data flow diagrams with the right notation. Covers DFD levels 0-2, Yourdon-DeMarco vs Gane-Sarson symbols, and real-world examples.

When a system breaks down or a process produces wrong outputs, the first question is usually "where did the data go wrong?" Data flow diagrams (DFDs) answer that question by making data movement explicit — showing exactly where data comes from, where it goes, how it gets transformed, and where it gets stored.

Unlike flowcharts that model control flow (what happens next), DFDs model data flow (what data moves where). That distinction makes them the right tool for systems analysis, software design, and documenting complex data integrations.

What is a data flow diagram?

A data flow diagram is a visual representation of how data moves through a system. It shows:

  • External entities — sources and destinations of data outside the system
  • Processes — transformations that convert input data into output data
  • Data stores — places where data is held at rest
  • Data flows — the movement of data between the above elements

DFDs don't show timing, decision logic, or who performs tasks. They show data movement exclusively. This focused scope makes them useful for understanding and designing systems without getting lost in implementation details.
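As a rough illustration, the four element types can be captured in a small data model. The sketch below is one possible encoding, not a standard one: the class names `Node` and `Flow` and the `kind` strings are assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical kind names for the three node types a DFD can contain.
NODE_KINDS = {"entity", "process", "store"}


@dataclass(frozen=True)
class Node:
    """An external entity, a process, or a data store."""
    name: str
    kind: str

    def __post_init__(self):
        if self.kind not in NODE_KINDS:
            raise ValueError(f"unknown node kind: {self.kind!r}")


@dataclass(frozen=True)
class Flow:
    """A named movement of data from one node to another."""
    label: str
    source: Node
    target: Node


customer = Node("Customer", "entity")
validate = Node("Validate Order", "process")
orders = Node("D1: Orders", "store")

flows = [
    Flow("Order", customer, validate),
    Flow("Valid Order", validate, orders),
]

for f in flows:
    print(f"{f.source.name} --[{f.label}]--> {f.target.name}")
```

Nothing in the model records timing, decisions, or who performs a task, which mirrors the deliberately narrow scope of a DFD.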

DFD symbols

Two notations dominate: Yourdon-DeMarco and Gane-Sarson. They represent the same concepts with different shapes.

Yourdon-DeMarco notation

The original structured analysis notation, widely used in software engineering:

| Element | Symbol | Description |
| --- | --- | --- |
| Process | Circle (bubble) | Transforms data; labeled with a verb phrase |
| Data store | Open-ended rectangle | Stores data; labeled with a noun |
| External entity | Rectangle | Source or destination outside the system |
| Data flow | Arrow with label | Named data moving between elements |

  ┌──────────┐          ╭──────────╮          ════════════════
  │ Customer │────────→ │ Validate │────────→   D1: Orders
  └──────────┘   Order  │  Order   │   Valid  ════════════════
                        ╰──────────╯   Order

Gane-Sarson notation

Common in information systems and business analysis:

| Element | Symbol | Description |
| --- | --- | --- |
| Process | Rectangle with round corners (split header) | Labeled with ID and process name |
| Data store | Rectangle open on the right side | Labeled with D-number and name |
| External entity | Rectangle | Source or destination outside the system |
| Data flow | Arrow with label | Named data moving between elements |

  ┌──────────┐          ╭────────────────╮          ╔════╦══════════
  │ Customer │────────→ │ 1.0            │────────→ ║ D1 ║ Orders
  └──────────┘   Order  ├────────────────┤   Valid  ╚════╩══════════
                        │ Validate Order │   Order
                        ╰────────────────╯

Which notation to use?

Both convey the same information. Choose based on your context:

  • Yourdon-DeMarco: preferred in software engineering, academia, and when using structured analysis methods
  • Gane-Sarson: common in business information systems and enterprise contexts

Pick one and stay consistent throughout all diagrams for a project.

DFD levels

DFDs are organized hierarchically. Higher-level diagrams provide an overview; lower-level diagrams provide detail. Each lower-level diagram decomposes a single process from the level above into its sub-processes.

Level 0: Context diagram

The context diagram shows the entire system as a single process, surrounded by external entities. It defines the system boundary and shows what data crosses that boundary.

                 Customer Order
  ┌──────────┐ ─────────────────→ ╭──────────────────────╮
  │ Customer │                    │                      │
  └──────────┘ ←───────────────── │   Order Management   │
               Order Confirmation │        System        │
                                  │                      │
  ┌──────────┐ ─────────────────→ ╰──────────────────────╯
  │ Supplier │  Inventory Update             │
  └──────────┘                               ↓
                                      ┌──────────────┐
                                      │ Payment Proc.│
                                      └──────────────┘

A context diagram should fit on one page and show only boundary-crossing data flows — no internal detail. If you find yourself adding internal processes, you're at the wrong level.
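One way to test whether a flow belongs on the context diagram is to ask whether exactly one of its endpoints is an external entity. The sketch below assumes flows are stored as `(source, target, label)` tuples; the function name `boundary_flows` and the label "Shipment Request" are hypothetical.

```python
def boundary_flows(flows, externals):
    """Return the flows that cross the system boundary.

    flows: iterable of (source, target, label) tuples.
    externals: set of external entity names.
    A flow crosses the boundary when exactly one endpoint is external.
    """
    return [
        (src, dst, label)
        for src, dst, label in flows
        if (src in externals) != (dst in externals)
    ]


externals = {"Customer", "Shipping Partner"}
flows = [
    ("Customer", "1.0 Receive Order", "Order"),
    ("1.0 Receive Order", "2.0 Process Payment", "Validated Order"),
    ("D1: Orders", "3.0 Fulfill Order", "Order Data"),
    ("3.0 Fulfill Order", "Shipping Partner", "Shipment Request"),  # hypothetical label
]

for src, dst, label in boundary_flows(flows, externals):
    print(f"{src} -> {dst}: {label}")
```

Only the first and last flows survive the filter; the other two are internal detail and belong on a lower-level diagram.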

Level 1: Overview diagram

The level 1 diagram decomposes the single context process into its major sub-processes. These are the primary functional areas of the system.

  ┌──────────┐          ╭──────────╮           ╭──────────╮
  │ Customer │────────→ │ 1.0      │─────────→ │ 2.0      │───┐
  └──────────┘   Order  │ Receive  │ Validated │ Process  │   │
                        │  Order   │   Order   │ Payment  │   │
                        ╰──────────╯           ╰──────────╯   │
                             │                                ↓
                             ↓                           ╭──────────╮
                    ╔════════════════╗                   │ 3.0      │
                    ║ D1: Orders     ║─────────────────→ │ Fulfill  │
                    ╚════════════════╝    Order Data     │  Order   │
                                                         ╰──────────╯
                                                              │
                                                              ↓
                                                         ┌──────────┐
                                                         │ Shipping │
                                                         │ Partner  │
                                                         └──────────┘

Level 1 diagrams typically have 3-7 processes. If you have more, consider grouping into fewer higher-level processes.

Level 2: Detailed diagram

Level 2 decomposes each level 1 process into its sub-steps. Each bubble from level 1 gets its own level 2 diagram.

For example, expanding process 1.0 "Receive Order" from above:

  ┌──────────┐           ╭──────────╮           ╭──────────╮
  │ Customer │──────────→ │ 1.1      │──────────→ │ 1.2      │
  └──────────┘  Raw Order │ Validate │ Valid Data │ Check    │
                          │ Format   │            │ Stock    │
                          ╰──────────╯            ╰──────────╯
                               │                      │    │
                          Invalid                     │  In Stock
                           Order                   Out of │
                               ↓                   Stock  ↓
                          ┌──────────┐              ╭──────────╮
                          │ Customer │              │ 1.3      │
                          └──────────┘              │ Reserve  │
                                                    │ Items    │
                                                    ╰──────────╯
                                                         │
                                                         ↓
                                                ┌════════════╗
                                                ║ D1: Orders ║
                                                ╚════════════╝
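The numbering used in these diagrams (1.0, 2.0 at level 1; 1.1, 1.2 at level 2) encodes the hierarchy, so a process ID is enough to recover its level and parent. The helper names below are hypothetical, and the handling of three-part IDs such as 1.2.3 is an assumed extension of the same convention.

```python
from collections import defaultdict


def level_of(pid):
    """Return the DFD level implied by a process ID such as '1.0' or '1.2'."""
    parts = pid.split(".")
    if len(parts) == 2 and parts[1] == "0":
        return 1
    return len(parts)


def parent_of(pid):
    """Return the ID of the process this one decomposes, or None at level 1."""
    parts = pid.split(".")
    if len(parts) == 2:
        return None if parts[1] == "0" else f"{parts[0]}.0"
    return ".".join(parts[:-1])


def group_children(ids):
    """Map each parent process ID to the list of its sub-process IDs."""
    tree = defaultdict(list)
    for pid in ids:
        parent = parent_of(pid)
        if parent is not None:
            tree[parent].append(pid)
    return dict(tree)


ids = ["1.0", "1.1", "1.2", "1.3", "2.0", "3.0"]
print(group_children(ids))
```

Grouping IDs this way makes it easy to see which level 1 processes already have a level 2 diagram and which do not.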

How deep should you go?

Stop decomposing when a process is:

  • Simple enough to describe in a few sentences
  • Implemented by a single person or atomic system function
  • Already at the detail level needed for implementation

Most systems need no more than level 2. Level 3 is rare and often indicates the system is too complex or the decomposition isn't well-structured.

DFD vs flowchart

These are frequently confused because both use boxes and arrows. They answer different questions.

| Aspect | Data Flow Diagram | Flowchart |
| --- | --- | --- |
| Shows | How data moves through a system | How control flows through a process |
| Primary question | What data goes where? | What happens next? |
| Time/sequence | Not modeled (data transforms only) | Central; sequence is the main structure |
| Decision logic | Not represented | Explicit (diamond decision nodes) |
| Who performs | Not shown | Can show with swimlane format |
| Data storage | Explicit data store symbol | Not represented |
| Best for | Systems analysis, data architecture | Process documentation, procedure guides |

A flowchart shows the steps in a loan approval process. A DFD shows what data the loan application system receives, where it's stored, and what outputs it produces.

Use a DFD when you're analyzing or designing a system. Use a flowchart when you're documenting a procedure.

Step-by-step: creating a DFD

Step 1: Define the system boundary

Identify what is inside your system and what is outside:

  • Inside: processes, data stores, and data flows you control
  • Outside: external entities — customers, partners, external systems

Everything outside the boundary is an external entity. Data crossing the boundary appears as flows in the context diagram.

Step 2: Draw the context diagram (Level 0)

  1. Place the entire system as a single labeled process symbol in the center (a circle in Yourdon-DeMarco notation, a rounded rectangle in Gane-Sarson)
  2. Identify all external entities and place them around the outside
  3. Draw data flows crossing the boundary with descriptive labels
  4. Check completeness: is any data entering or leaving the system unlabeled?

Step 3: Identify major processes (Level 1)

Decompose the system into 3-7 major processes:

  • Name each process with a verb phrase ("Validate Order," "Process Payment")
  • Number them (1.0, 2.0, 3.0)
  • Identify which data flows connect them

Step 4: Add data stores

Identify where data is held between processes:

  • Databases: customer records, order history, inventory
  • Files: log files, configuration
  • External data: data passed to/from external systems (often represented as boundary flows, not internal stores)

Name each data store with a noun and assign a D-number (D1, D2).

Step 5: Connect with data flows

Draw labeled arrows between elements:

  • Flows must have descriptive names ("customer order," "validated record," "invoice")
  • Flows connect processes to processes, processes to data stores, and external entities to processes
  • Data stores do not connect directly to external entities (data must pass through a process)
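Read together, the rules above suggest a simple test: every flow should have a process on at least one end. The sketch below applies that test, assuming flows are `(source, target, label)` tuples and that a `kinds` dictionary maps each name to "entity", "process", or "store"; both structures are assumptions for this example.

```python
def invalid_flows(flows, kinds):
    """Return the flows where neither endpoint is a process."""
    return [
        (src, dst, label)
        for src, dst, label in flows
        if "process" not in (kinds[src], kinds[dst])
    ]


kinds = {
    "Customer": "entity",
    "Validate Order": "process",
    "D1: Orders": "store",
}
flows = [
    ("Customer", "Validate Order", "customer order"),
    ("Validate Order", "D1: Orders", "validated record"),
    ("D1: Orders", "Customer", "invoice"),  # store to entity: bypasses every process
]

for src, dst, label in invalid_flows(flows, kinds):
    print(f"invalid flow: {src} -> {dst} ({label})")
```

The third flow is reported because the invoice would have to pass through a process before reaching the customer.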

Step 6: Decompose to Level 2

For each major process in Level 1, draw a separate Level 2 diagram showing its internal sub-processes. The data flows entering and leaving the Level 1 process become the boundary flows of the Level 2 diagram.
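The boundary flows of a Level 2 diagram can be read straight off the Level 1 flow list. The sketch below assumes `(source, target, label)` tuples; the function name `process_boundary` and the label "Paid Order" are hypothetical.

```python
def process_boundary(process, flows):
    """Split the Level 1 flows touching `process` into inputs and outputs.

    The result lists the flows that the Level 2 diagram for `process`
    has to expose at its boundary.
    """
    inputs = [(src, label) for src, dst, label in flows if dst == process]
    outputs = [(dst, label) for src, dst, label in flows if src == process]
    return {"inputs": inputs, "outputs": outputs}


level1 = [
    ("Customer", "1.0", "Order"),
    ("1.0", "2.0", "Validated Order"),
    ("2.0", "3.0", "Paid Order"),  # hypothetical label
]

print(process_boundary("2.0", level1))
```

Starting each Level 2 diagram from this list helps keep the two levels in step from the outset.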

Step 7: Verify consistency

Check that:

  • Every data flow entering a process is used by that process
  • Every data flow leaving a process can be produced from the data that enters it
  • Data stores are accessed only through processes (not directly by external entities)
  • Level 1 boundary flows match the context diagram flows
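The last check, often called balancing, lends itself to automation. The sketch below compares two sets of flows, assuming each flow is written as a `(direction, label)` tuple relative to the process being decomposed. The function name and the sample flows, including "Refund Notice", are hypothetical.

```python
def balance_report(parent_flows, child_flows):
    """Compare a parent process's flows with its child diagram's boundary flows.

    Each flow is a (direction, label) tuple, where direction is "in" or "out".
    """
    parent, child = set(parent_flows), set(child_flows)
    return {
        "missing_in_child": sorted(parent - child),
        "extra_in_child": sorted(child - parent),
    }


# Hypothetical data: the context diagram versus the Level 1 boundary.
context = [("in", "Order"), ("out", "Order Confirmation")]
level1_boundary = [
    ("in", "Order"),
    ("out", "Order Confirmation"),
    ("out", "Refund Notice"),
]

report = balance_report(context, level1_boundary)
print(report)
```

Two empty lists mean the levels agree. Here the extra "Refund Notice" flow would need to be added to the context diagram or removed from Level 1.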

Real-world example: e-commerce order system

Level 0: Context diagram

  ┌──────────┐   Order Request    ╭──────────────────────╮
  │ Customer │ ─────────────────→ │                      │
  │          │ ←───────────────── │                      │
  └──────────┘    Order Status    │      E-Commerce      │
  ┌──────────┐   Charge Request   │     Order System     │
  │ Payment  │ ←───────────────── │                      │
  │ Gateway  │ ─────────────────→ │                      │
  └──────────┘   Payment Result   ╰──────────────────────╯
                                     │                ↑
                        Ship Request │                │ Shipment Data
                                     ↓                │
                                  ┌──────────────────────┐
                                  │   Shipping Partner   │
                                  └──────────────────────┘

Level 1: Major processes

                                        ╭──────────╮
  Customer ── Order Request ──────────→ │ 1.0      │ ── Validated Order ──→ D1: Orders
                                        │ Validate │
                                        │ Order    │
                                        ╰──────────╯
                                             │
                                      Validated Order
                                             ↓
                                        ╭──────────╮
  Payment Gateway ←── Charge Request ── │ 2.0      │ ── Payment Record ──→ D2: Payments
  Payment Gateway ── Payment Result ──→ │ Process  │
                                        │ Payment  │
                                        ╰──────────╯
                                             │
                                      Confirmed Order
                                             ↓
                                        ╭──────────╮
  D1: Orders ── Order Data ───────────→ │ 3.0      │ ── Ship Request ──→ Shipping
  D3: Products ── Stock Data ─────────→ │ Fulfill  │                     Partner
                                        │ Order    │
                                        ╰──────────╯
                                             │
                                       Shipment Data
                                             ↓
                                        ╭──────────╮
                                        │ 4.0      │ ── Order Status ──→ Customer
                                        │ Track &  │
                                        │ Notify   │
                                        ╰──────────╯
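A diagram like this can also be kept as data and rendered with a graph tool. The sketch below emits Graphviz DOT text for a subset of the flows above. The shape mapping is an assumption: a cylinder stands in for the data store symbol, and names are assumed not to contain double quotes.

```python
# Assumed mapping from DFD element kind to a Graphviz node shape.
SHAPES = {"entity": "box", "process": "ellipse", "store": "cylinder"}


def to_dot(nodes, flows):
    """Build DOT source from a {name: kind} dict and (src, dst, label) tuples."""
    lines = ["digraph dfd {", "  rankdir=LR;"]
    for name, kind in nodes.items():
        lines.append(f'  "{name}" [shape={SHAPES[kind]}];')
    for src, dst, label in flows:
        lines.append(f'  "{src}" -> "{dst}" [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


nodes = {
    "Customer": "entity",
    "Payment Gateway": "entity",
    "1.0 Validate Order": "process",
    "2.0 Process Payment": "process",
    "D2: Payments": "store",
}
flows = [
    ("Customer", "1.0 Validate Order", "Order Request"),
    ("1.0 Validate Order", "2.0 Process Payment", "Validated Order"),
    ("2.0 Process Payment", "Payment Gateway", "Charge Request"),
    ("2.0 Process Payment", "D2: Payments", "Payment Record"),
]

dot = to_dot(nodes, flows)
print(dot)
```

The printed text can be saved to a file and passed to the Graphviz `dot` command to produce an image.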

Level 2: Decomposing process 1.0 (Validate Order)

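                                  ╭──────────╮
  Customer ── Order Request ────→ │ 1.1      │ ── Invalid ──→ Customer (Error)
                                  │ Check    │
                                  │ Format   │
                                  ╰──────────╯
                                       │
                                Formatted Order
                                       ↓
                                  ╭──────────╮
  D3: Products ── Stock Info ───→ │ 1.2      │ ── Unavailable ──→ Customer
                                  │ Verify   │
                                  │ Stock    │
                                  ╰──────────╯
                                       │
                                Available Order
                                       ↓
                                  ╭──────────╮
  D4: Customers ── Auth Data ───→ │ 1.3      │ ── Unverified ──→ Customer
                                  │ Verify   │
                                  │ Customer │
                                  ╰──────────╯
                                       │
                                Validated Order
                                       ↓
                                   D1: Orders (stored)

To keep the levels balanced, the reads from D3: Products and D4: Customers and the three rejection flows to Customer should also be drawn on process 1.0 in the Level 1 diagram, and the Validated Order flow to process 2.0 should also leave process 1.3 here.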

Common DFD mistakes

Connecting external entities directly to data stores. Data stores are internal; external entities cannot access them directly. All data crossing the system boundary must pass through a process.

Unlabeled data flows. Every arrow must have a descriptive name. "Data" or "Info" are not descriptive. A flow should be named for what the data represents: "customer order," "payment confirmation," "stock level."

Processes without inputs or outputs. A process must have at least one input flow and one output flow. A circle with no incoming data "creates" data from nothing — that's not a process, it's an external entity. A circle with no outgoing data discards everything — model that as a data store write or remove the process.

Mixing control flow with data flow. Decisions, sequence, and control logic don't belong in DFDs. If you find yourself drawing decision diamonds, you're creating a flowchart, not a DFD. DFDs show data movement only.

Too much detail at level 1. Level 1 should have 3-7 major processes. If you're showing 15 processes at level 1, you've skipped the hierarchical decomposition. Group related processes into higher-level bubbles and use level 2 to show detail.

Inconsistent levels. The data flows entering and leaving process 2.0 in the level 1 diagram must match the boundary flows in the level 2 diagram for process 2.0. Inconsistency means the diagrams don't represent the same system.

Data stores shared across subsystems without explanation. If multiple processes access the same data store, make sure it's intentional and that the access makes sense. Overusing a single "main database" data store hides important data architecture decisions.
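Several of these mistakes can be caught mechanically. The sketch below is a minimal lint pass over one diagram, covering vague labels, processes with no input or no output, and an overcrowded diagram. The function name, the flow representation, and the process "9.0 Archive" are hypothetical.

```python
# Labels treated as non-descriptive; extend the set as needed.
GENERIC_LABELS = {"", "data", "info"}


def lint(processes, flows, max_processes=7):
    """Return a list of problems found on a single diagram.

    processes: list of process names on the diagram.
    flows: list of (source, target, label) tuples.
    """
    problems = []
    for src, dst, label in flows:
        if label.strip().lower() in GENERIC_LABELS:
            problems.append(f"vague label on {src} -> {dst}: {label!r}")
    sources = {src for src, _, _ in flows}
    targets = {dst for _, dst, _ in flows}
    for p in processes:
        if p not in targets:
            problems.append(f"{p}: no input flow")
        if p not in sources:
            problems.append(f"{p}: no output flow")
    if len(processes) > max_processes:
        problems.append(f"{len(processes)} processes on one diagram")
    return problems


processes = ["1.0 Validate Order", "2.0 Process Payment", "9.0 Archive"]
flows = [
    ("Customer", "1.0 Validate Order", "customer order"),
    ("1.0 Validate Order", "2.0 Process Payment", "Data"),
    ("2.0 Process Payment", "9.0 Archive", "payment confirmation"),
]

for problem in lint(processes, flows):
    print(problem)
```

A pass like this does not judge whether shared data stores are intentional or whether control logic has crept in; those still need a human review.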

Creating data flow diagrams with Flowova

Mapping data flows manually is tedious — identifying all the boundary flows, numbering processes consistently, and keeping levels synchronized takes significant effort.

Flowova's data flow diagram maker generates DFD structures from plain-language system descriptions. Describe your system's inputs, outputs, and major processes, and get a draft diagram you can refine. This is especially useful for creating context diagrams and level 1 overviews quickly, then drilling into level 2 detail for the processes that need it.

Conclusion

DFDs are the right tool when you need to understand or communicate how data moves through a system — not what happens step by step, but where data originates, what transforms it, where it's stored, and where it ultimately goes.

Start with a context diagram to establish the system boundary. Decompose to level 1 to identify major processes. Go to level 2 only for processes that need further clarification. Keep data flows named specifically, avoid mixing control logic into the diagram, and verify consistency across levels.
