RPG Business Rule Extraction
The most valuable artifact of any RPG modernization isn't translated code — it's a catalog of business rules captured in language that engineers and business analysts can both understand. The RPG is the implementation; the rules are the actual asset.
This template extracts those rules so they survive the migration regardless of whether you replatform, refactor, rewrite, or replace.
When to use
- You're migrating an RPG program (any strategy: replatform to PowerVS, refactor with Profound/X-Analysis, rewrite to Java/.NET, replace with COTS)
- The original developers / analysts are unavailable or aging out
- You need rules in a form business stakeholders can validate
- You're building a behavior parity test suite — these rules become the test cases
Why RPG extraction is different from COBOL
RPG has its own challenges:
- Indicators (`*IN01` through `*IN99`, `*INLR`, `*INH1`-`*INH9`): boolean flags used to drive program flow. Often used as poor man's variables.
- The RPG cycle (in RPG II/III): implicit read/process/write loop driven by file specs. Programs without explicit `READ` statements rely on this. Almost incomprehensible to modern engineers without context.
- Fixed-format positional layout: columns 6, 7, 18, 26, etc. each have specific meanings. Hard to read without a reference.
- Factor 1 / Operation / Factor 2 / Result: older RPG arithmetic is written as columns (operand, opcode, operand, result field) rather than as an expression, which can read like reverse-Polish notation to newcomers.
- Subroutines via EXSR: global state shared across all subroutines. Like COBOL paragraphs, BEGSR/ENDSR subroutines take no parameters and have no local data.
- Globals everywhere: every variable in older RPG is global.
A naive "translate this RPG to Java" prompt produces garbage because the model doesn't know about indicators or the cycle. This template forces explicit handling.
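To make the hidden control flow concrete, here is a minimal Python sketch of what a cycle-based report program does implicitly: read a primary file in key order, fire an L1 level break when the customer number changes, and print a grand total at LR. The record layout and the names `run_cycle`, `cust_no`, and `amount` are hypothetical, and the sketch simplifies the real cycle (no matching records, no first-page output, one control level only).

```python
from decimal import Decimal
from typing import Iterable


def run_cycle(records: Iterable[dict]) -> list[str]:
    """Explicit version of an implicit RPG cycle with one control level (L1).

    Assumes records arrive sorted by cust_no, as a keyed primary file would.
    """
    output: list[str] = []
    customer_total = Decimal("0")
    grand_total = Decimal("0")
    previous_key = None

    for record in records:
        # L1 break: the control field changed, so total-time work for the
        # previous group runs before detail-time work for this record.
        if previous_key is not None and record["cust_no"] != previous_key:
            output.append(f"TOTAL {previous_key} {customer_total}")
            customer_total = Decimal("0")

        # Detail time: runs once per record read.
        customer_total += record["amount"]
        grand_total += record["amount"]
        previous_key = record["cust_no"]

    # LR: after the last record the pending level break fires, then final totals.
    if previous_key is not None:
        output.append(f"TOTAL {previous_key} {customer_total}")
    output.append(f"GRAND {grand_total}")
    return output


sample = [
    {"cust_no": "C001", "amount": Decimal("10.00")},
    {"cust_no": "C001", "amount": Decimal("5.50")},
    {"cust_no": "C002", "amount": Decimal("7.25")},
]
report = run_cycle(sample)
```

None of the `if` or `for` statements above appear in the RPG source of a cycle program; they are implied by the file and calculation specs, which is why a translation prompt has to ask for them explicitly.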
Prompt
You are a senior systems analyst with deep RPG and IBM i expertise. You
extract business rules from RPG programs into structured catalogs that
business analysts can validate.
## Input
**RPG program source:**
```rpg
{{rpg_program_source}}
```
**RPG dialect:** {{rpg_dialect}}
**Stated purpose:** {{program_purpose}}
**File definitions / DDS:** {{file_definitions}}
**Called programs:** {{called_programs}}
## Output
A Markdown document organized as follows:
### 1. Program identification
- Program name (from H-spec or program-name comment)
- RPG dialect confirmed (II / III / IV fixed / IV free / SQLRPGLE)
- Program type (interactive / batch / submitted / service program / module)
- Approximate line count
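- Cycle-based or non-cycle (cycle-driven programs declare a primary file in the F-specs; a MAIN or NOMAIN keyword on the control spec means no cycle; /FREE alone does not tell you either way)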
- Date of last modification (from change history comments)
- Author / shop (from comments)
### 2. Inputs and outputs
**Files used:**
For each file declared in F-specs:
- File name (and external description if used)
- File type (database, display, printer)
- Usage (input, output, update, combined)
- Access path (sequential, keyed)
- Record format(s) used
- Approximate record count if known
**Display files (interactive programs):**
- Format names referenced (EXFMT, READ, WRITE)
- Function keys handled
- Subfile usage (SFL, SFLCTL)
- Indicators set by display
**Inputs (parameters):**
- Entry parameters (*ENTRY PLIST, or a procedure interface (PI) in RPG IV)
- Each parameter: name, type, length, direction (input/output/both)
**Outputs:**
- Files written / updated
- Parameters returned
- Database commits / journals
- Print files
- Messages sent (SNDMSG, SNDPGMMSG)
**Side effects:**
- Database updates (UPDATE, WRITE, DELETE on data files)
- Calls to other programs (CALL, CALLP)
- Messages to message queues
- Data area locks/updates (`*LDA`, named data areas)
- Job logs (DSPLY)
### 3. Business purpose
In 2-3 paragraphs, plain language: what does this program DO for the business?
Translate from RPG-speak to business-speak:
- "Reads CUSTMAST keyed by customer number, calculates available credit, writes ORDHDR"
becomes
- "Validates a customer's credit availability before accepting an order, writing the approved order header to the orders database"
If you can infer business context (manufacturing operations, ERP transactions, distribution), say so explicitly.
### 4. Indicator analysis
This is RPG-specific and critical. For each indicator used:
| Indicator | Purpose | Set by | Used by | Migration concern |
|-----------|---------|--------|---------|-------------------|
| *IN01 | "Record not found" flag | CHAIN resulting indicator (on when no record matches) | IF *IN01 = '0' | becomes boolean isFound (inverted) or an empty Optional |
| *IN10 | "End of file" | READ at end | DOW *IN10 = '0' | becomes while(!eof) |
| *IN50 | "Display in error" | Validation routine | OVERLAY on display | becomes form-level error state |
| *INKA | F1 pressed (help) | Display file (function key indicator) | CASE statement | becomes onClick or key handler |
| *INLR | Last record (program end) | Set by program at exit | RPG cycle terminator | becomes return statement |
For each indicator, document:
- What boolean/state it represents in business terms
- Where it gets set
- Where it gets read
- What its modern equivalent would be (boolean variable, exception, return value, event)
This is where RPG migrations often fail: indicators get translated mechanically as boolean variables, when they should become structured state.
### 5. Subroutine catalog
For each subroutine (BEGSR / ENDSR) or subprocedure (P-spec or DCL-PROC):
```markdown
## SUBROUTINE: SUB-NAME
**Purpose:** [1-2 sentences]
**Lines:** [approximate range]
**Called by:** [where in code, or "RPG cycle / not called explicitly"]
**Inputs (used global vars):** [list]
**Outputs (modified global vars):** [list]
**Files affected:** [list]
**Indicators set:** [list]
**Indicators read:** [list]
### Logic summary
[3-5 bullet points of what it does]
### Modern equivalent
[How this would be structured in Java/.NET — method? class? service?]
```
### 6. Business rules catalog
For each business rule, document:
```markdown
## RULE-NNN: [Short name in business language]
**Plain-language statement:**
[1-2 sentences a business analyst can validate]
**Source location:**
[Subroutine name + line range, or fixed-format line numbers]
**RPG implementation:**
~~~rpg
[The actual RPG code that implements the rule]
~~~
**Inputs the rule depends on:**
- Field name 1 (from file X, type, length)
- Field name 2 (program variable, type)
- Indicator state (which indicators must be on/off)
**Output / effect:**
- Field set, file updated, indicator set, branch taken
**Edge cases / boundaries:**
- What happens at zero values
- What happens at maximum field values (overflow or high-order truncation)
- What happens with blank inputs
- Date boundary cases (if using date fields or 6- or 8-digit packed dates)
- Negative number handling
- Sign convention (packed decimal sign nibble)
- For older RPG: cycle iteration behavior
**Confidence:**
- ✅ HIGH: clearly stated in code, comments confirm
- ⚠️ MEDIUM: inferred from code, no confirming comments
- ❓ LOW: code suggests but logic is convoluted; needs SME validation
**Tags:**
[business-domain] [calculation-type] [validation] [pricing] etc.
```
### 7. The cycle (for cycle-based RPG II/III)
If this is a cycle-based program (a primary file in the F-specs drives processing; no MAIN or NOMAIN keyword on the control spec):
Document explicitly:
- **Primary file** that drives the cycle
- **Matching-record indicator** (MR) and the match fields (M1-M9) that drive it
- **Level break indicators** (L1-L9) and what they trigger
- **Detail-time vs total-time calculations** (calculation lines conditioned by L1-L9 or LR in positions 7-8 run at total time; output lines are typed H, D, T, or E)
- **First-page and last-record indicators** (1P, LR)
The cycle is implicit; document what the cycle is doing in plain terms:
> "Read each customer record. If customer number changes (L1 break), print
> customer total and reset accumulator. After last record (LR), print
> grand total."
This is critical — engineers translating cycle-based RPG without
understanding the cycle produce code that doesn't work.
### 8. Decision tables
For programs with complex IF chains or CASE statements, extract decision tables:
| Customer Type | Order Amount | Credit Status | Action |
|---------------|--------------|---------------|--------|
| 'A' | > 5000 | OK | Approve, *IN50 OFF |
| 'A' | > 5000 | HOLD | Reject, *IN50 ON, msg "Credit hold" |
| 'A' | <= 5000 | * | Approve, *IN50 OFF |
| 'B' | > 1000 | OK | Approve pending manager review, *IN51 ON |
| 'B' | * | * | Approve, *IN50 OFF |
Decision tables are easier for analysts to validate than nested RPG logic.
### 9. Calculation formulas
For arithmetic-heavy programs, extract formulas:
```
DISCOUNT_AMT = ORDER_AMT * DISCOUNT_PCT / 100
where:
- ORDER_AMT = sum of (LINE_QTY * UNIT_PRICE) for all order lines
- DISCOUNT_PCT = lookup from CUSTMAST.CUSTDPC for current customer
- Result truncated to 2 decimal places (no half-adjust: the operation carries no (H) extender)
```
Document precision and rounding:
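- Whether half-adjust is requested. It is set per operation, not program-wide: the `(H)` extender on opcodes such as MULT, DIV, or EVAL, or `%DECH` in expressions. Without it, results are truncated
- Control-spec keywords that change intermediate precision or overflow behavior (`EXPROPTS`, `DECPREC`, `TRUNCNBR`)
- Where rounding happens (each step or final)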
- Sign handling for negative values
Financial rounding bugs are the most embarrassing post-migration regressions. Capture exactly how the legacy rounds.
### 10. Database access patterns
For each file accessed, document:
| File | Operation | Access path | Indicators | Notes |
|------|-----------|-------------|------------|-------|
| CUSTMAST | CHAIN by CUSTNO | keyed | *IN01 (not found), *IN02 (error) | Lookup |
| ORDHDR | WRITE | sequential add | none | Append only |
| ORDDTL | UPDATE by ORDNO+LINNO | keyed | *IN03 (error) | Modifies status |
| INVMAST | READ + UPDATE chain | keyed by ITMNO | *IN04 (locked) | Allocates inventory |
Distinguish:
- **Pure reads** (CHAIN, READ, READE) — easy to migrate
- **Updates** (UPDATE) — need transaction semantics in target
- **Adds** (WRITE) — sequence/identity column considerations
- **Deletes** (DELETE) — soft vs hard delete in target
- **Locking patterns** (reads against update-capable files lock the record unless the `(N)` no-lock extender is used; UNLOCK; ALCOBJ in the calling CL): concurrency model
### 11. SQLRPGLE specifics
If the program is SQLRPGLE (embedded SQL):
For each `EXEC SQL` block:
- Operation (SELECT INTO, UPDATE, INSERT, DELETE, FETCH)
- Tables / views referenced
- Host variables (program fields used in SQL)
- SQLCODE handling
- Cursor behavior (if applicable)
These are usually easier to migrate than native I/O, because embedded SQL
maps closely onto JDBC / Entity Framework / Dapper.
### 12. CL program calls
If this program is called from CL programs, document:
- Calling CL programs (by name)
- Parameters passed in
- CL commands that prepare environment (OVRDBF, OPNQRYF) before this call
- How that setup changes the program's behavior (which file or member it really reads, record selection, ordering); the target platform has to replicate it
### 13. Display file (5250 / green-screen) considerations
For interactive programs using display files:
- **Subfile usage:** load-all vs load-as-needed; record count strategy
- **Function keys handled:** F1-F24 mapping
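- **Field-level help (HELP, HLPARA, HLPRCD, HLPPNLGRP keywords):** how help is structured
- **Validation:** check digits (CHECK), range checks (RANGE), value lists (VALUES, COMP), and the error messages CHKMSGID attaches to them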
- **Conditioning indicators:** when fields show/hide, become input/output, change color
- **Window groups, message subfiles:** modal patterns
Modern equivalents are not obvious. A subfile becomes a paginated table.
Conditioning indicators become reactive UI state. Function keys become
keyboard shortcuts or buttons. Document this mapping for the rewrite.
### 14. Quirks and tribal knowledge
Things you can infer that aren't documented:
- "The check `IF AMOUNT > 99999` suggests amounts were originally stored
as a 5-digit field (5,0) and someone added a higher cap later"
- "The hard-coded comparison `IF YEAR = '99'` suggests Y2K hack — verify
what's intended"
- "The MOVEL *BLANKS before assignment suggests a trailing-character bug
someone worked around"
- "Indicator *IN72 is set but never tested in this program — likely tested
by a calling program or used to be"
These are bugs-as-features that migrations get wrong. Capture them.
### 15. Migration-relevant observations
Specific to migrating this program:
- **Cyclomatic complexity:** simple/medium/complex
- **State management:** does it carry state across calls? (matters for
stateless service rewrites)
- **Cycle-based vs linear:** cycle programs need restructure for non-RPG targets
- **Indicators-as-variables:** how much program logic depends on indicator state
- **Global variables:** how much of the program assumes shared variables
- **5250 UI logic:** if interactive, how much logic is tangled with display flow
- **Concurrency:** does it assume exclusive file access?
- **Restart/recovery:** how is it restarted on failure (CL retry logic, manual)
### 16. Open questions
What you can't determine from the code:
- "Field FILLER15 in CUSTMAST DDS — purpose unclear; preserved but needs
SME confirmation"
- "Indicator *IN72 is set on line 145 but never tested; called program?
legacy debug?"
- "PARM2 is sometimes blank, sometimes contains a value; meaning of blank
case unclear"
Each open question:
- The question
- Where it surfaces
- Who could answer (SME, CL caller, RPG specialist)
- Default assumption
- Risk of wrong default
## Quality bar
- Every business rule is **standalone validatable** by a business analyst
- Every rule has **specific source location** (line number, subroutine, fixed-format spec)
- Every rule has a **confidence level**
- Indicators get explicit treatment (this is critical for RPG)
- Cycle behavior is explicit if applicable
- Edge cases are explicit, not implied
- Open questions are listed honestly
## Style
- Plain English in rule statements; technical detail in RPG excerpts
- Acknowledge RPG-isms (don't pretend cycle-based RPG is normal control flow)
- Specific source locations
- Honest about uncertainty
- Capture WHY when inferable, not just WHAT
Tips
- One program at a time. Even short RPG programs (300 lines) often have 30+ business rules and use 20+ indicators. Don't try to extract from a service program with 10 procedures in one prompt.
- Pair with a DDS dump for files used. Without DDS, field meanings aren't always clear. Pull the DDS source separately and feed it as `file_definitions`.
- Run on representative programs first. If you have 500 RPG programs, do the 10 most critical first. Establish patterns. The other 490 will be similar but not identical.
- Indicators deserve special attention. Many RPG bugs survive migration because indicators were translated mechanically. The indicator analysis section is the most important part for an honest rewrite.
- For cycle-based programs, get an SME involved. The RPG II/III cycle is genuinely confusing; SME validation of the cycle interpretation is non-negotiable.
- Pair with business analysts. Extracted rules need validation. Output is a starting point, not a final spec.
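Feeding the DDS and the other inputs into the template can be scripted. The sketch below fills the `{{...}}` placeholders used in the prompt and refuses to render when a value is missing or blank, so a program never goes to the model without its `file_definitions`. The function name `render_prompt` and the strict failure behavior are assumptions for illustration, not part of the template.

```python
import re

# Matches {{name}} placeholders such as {{rpg_dialect}}.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def render_prompt(template: str, values: dict[str, str]) -> str:
    """Replace {{name}} placeholders; fail loudly on missing or blank values."""
    needed = set(PLACEHOLDER.findall(template))
    missing = sorted(name for name in needed if not values.get(name, "").strip())
    if missing:
        raise ValueError(f"missing template inputs: {', '.join(missing)}")
    # A function replacement avoids backslash handling in the substituted text,
    # which matters because RPG and DDS source can contain any character.
    return PLACEHOLDER.sub(lambda match: values[match.group(1)], template)


template = (
    "**RPG dialect:** {{rpg_dialect}}\n"
    "**File definitions / DDS:** {{file_definitions}}\n"
)
prompt = render_prompt(
    template,
    {"rpg_dialect": "RPG IV fixed", "file_definitions": "(DDS source for CUSTMAST)"},
)
```

Failing on a blank input is a deliberate choice here: a run without DDS still produces output, but field meanings get guessed rather than read.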
Common mistakes to avoid
- Translating RPG to pseudocode and stopping. That's not a business rule. Push to plain language a non-RPG-engineer can validate.
- Ignoring indicators. They're not just booleans; they encode state machines and event handling.
- Skipping the cycle for cycle-based RPG. The cycle IS the program for those.
- Treating SQLRPGLE the same as native I/O. SQL parts migrate easily; native I/O parts don't.
- Underestimating display-file logic. Conditioning indicators and field-level help often hold significant business rules.
- Inventing context to fill gaps. If you don't know, say so. Open questions are honest output.
- Treating extraction as one-shot. Iterative refinement is normal; budget multiple passes.
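As one illustration of the indicator point above, consider a keyed lookup that reports its outcome through two indicators, for example one for "not found" and one for "I/O error". A mechanical translation keeps two booleans and allows a meaningless "both on" combination; a structured translation names the states. This Python sketch uses hypothetical names (`LookupOutcome`, `from_indicators`, `next_step`) and an assumed indicator assignment; the real mapping has to come from the program's indicator analysis.

```python
from enum import Enum


class LookupOutcome(Enum):
    FOUND = "found"
    NOT_FOUND = "not_found"
    IO_ERROR = "io_error"


def from_indicators(not_found: bool, io_error: bool) -> LookupOutcome:
    """Collapse two legacy indicator flags into one explicit state.

    Assumption: an I/O error takes precedence, since the not-found flag
    is not meaningful when the read itself failed.
    """
    if io_error:
        return LookupOutcome.IO_ERROR
    if not_found:
        return LookupOutcome.NOT_FOUND
    return LookupOutcome.FOUND


def next_step(outcome: LookupOutcome) -> str:
    # Each state gets exactly one branch; there is no "both flags on" case.
    if outcome is LookupOutcome.FOUND:
        return "continue with credit check"
    if outcome is LookupOutcome.NOT_FOUND:
        return "reject: unknown customer"
    return "abort: customer file unavailable"
```

The precedence rule inside `from_indicators` is exactly the kind of decision that belongs in the rule catalog with a confidence level, rather than being made silently during translation.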
What this output enables
- Business analyst review and sign-off on rules before rewrite
- Behavior parity test cases for the migrated system
- User stories for the rewrite (each rule → one or more stories)
- Documentation that survives the RPG itself
- Decision about which rules to keep vs eliminate during migration (modernization is the chance to fix accumulated cruft, with explicit decisions not silent drift)
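A parity test case derived from a rule can be quite small. The sketch below encodes the discount formula from the prompt's calculation example in two modes, truncating and half-adjusting, and picks an input that separates the two behaviors. It uses Python's `decimal` module; the function name `discount_amount` and the sample values are hypothetical, and which mode matches a given program must come from its extracted rule.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

CENT = Decimal("0.01")


def discount_amount(order_amt: Decimal, discount_pct: Decimal, half_adjust: bool) -> Decimal:
    """DISCOUNT_AMT = ORDER_AMT * DISCOUNT_PCT / 100, rounded once at the end."""
    raw = order_amt * discount_pct / Decimal("100")
    # ROUND_DOWN truncates toward zero; ROUND_HALF_UP rounds halves away from zero.
    mode = ROUND_HALF_UP if half_adjust else ROUND_DOWN
    return raw.quantize(CENT, rounding=mode)


# 19.99 * 7.5 / 100 = 1.49925, so the two modes disagree by one cent.
truncated = discount_amount(Decimal("19.99"), Decimal("7.5"), half_adjust=False)
rounded = discount_amount(Decimal("19.99"), Decimal("7.5"), half_adjust=True)
```

Inputs like this one, where the dropped digits start with 5 or more, are the ones worth putting in the parity suite; values that round the same way in both modes prove nothing about the migrated arithmetic.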