Over time, the coding standards required by major standard-setting organizations have been inconsistent in the item sets required, the codes and coding instructions employed, and the timing of adoption of new or revised codes. These inconsistencies affect the use of data compiled over several years and from multiple sources, and they are described below. The standards for tumor inclusion, reportability, and multiple primary rules are addressed separately in Chapter III.

The UDSWG will continue to seek consensus on unresolved issues. Before new standards can be agreed upon, all interested parties must be provided sufficient time to study the proposals. Once UDSWG approves new standards, there must be adequate time for implementation. All members are encouraged to present suggestions or comments on proposed changes to the standards to UDSWG. The NAACCR website provides the name of the Committee Chair and electronic forms for proposing additions or revisions.

This chapter describes coding issues affecting each of the following types of measures:

The descriptions in this chapter are intended to provide a summary of coding issues. The original manuals should be consulted when a particular data use requires more detail. This chapter does not track changes made in individual codes over time. Some changes are noted in the individual item dictionary descriptions, and further information can be obtained from historic versions of this volume and from the individual standard setters associated with the items.

County--Current [1840] and County at DX [90]

NAACCR has adopted the Federal Information Processing Standards (FIPS) codes for county as the standard in this volume (see Appendix A for codes). However, standards for codes used vary somewhat by standard setter. For cancers diagnosed prior to 2002, the use of FIPS codes was not universally adopted. For this reason, users of data should determine which codes were used for coding County at DX in a particular file, since no field indicating “County at DX Coding System” is included in the NAACCR layout.

Spanish/Hispanic Origin (Hispanic Ethnicity) [190-210]

Although agreement on standard codes for the data item “Spanish/Hispanic Origin [190]” has been reached, substantial variation persists among registries in how Hispanic ethnicity or Spanish/Hispanic Origin is determined. Procedures for determining ethnicity include:

Population-based registries should attempt to categorize their cases using a method that best approximates the method used by the Census Bureau to determine ethnicity in the population denominators. A standard best method has not been determined. At this time, collection of ethnicity data is not a standard applied by the Canadian Cancer Registry or the provincial/territorial registries.

Attempts have been made to evaluate and improve numerator data based on various methodologic approaches to determining Spanish/Hispanic Origin. NAACCR sponsored a symposium in Atlanta, GA, in January 1996 to discuss methodologic issues faced when attempting to measure cancer among Hispanics. A report was prepared and is available on the NAACCR website under the heading “Epidemiologic Reports.” In 1999, a research group was formed from representatives of NAACCR to address issues of definition and to produce comparable data for Hispanic ethnicities across the United States. The group, operating under the auspices of the NAACCR Data Evaluation and Publications Committee, led to the creation of the NAACCR Hispanic Identification Algorithm (NHIA), an algorithm that uses a combination of NAACCR variables to directly or indirectly assign ethnicity.

Registries continue to use different methods to code Hispanic ethnicity. Users of the data must be able to determine how Hispanic ethnicity coding was assigned in a particular file. Based on historical and current discussions, NAACCR includes the field Spanish/Hispanic Origin [190] for direct recording of ethnicity from the medical record, as well as fields for Computed Ethnicity [200], Computed Ethnicity Source [210], and NHIA Derived Hispanic Origin [191].

Occupation and Industry [270-330]

Most population-based registries have found the collection of usual occupation and industry data to be difficult and of limited utility, and for many years no consensus on data items and codes for occupation and industry had been achieved. In 1992, the Cancer Registries Amendment Act required central registries funded by NPCR to collect occupation or industry data to the extent available in the medical record.33

Data on usual occupation and industry are unavailable in an unknown, but significant, proportion of medical records. Even when available, the quality of the data in the medical record is generally untested and often limited to less useful information such as “retired.” In contrast, this information generally is available in text format on death certificates and, in some states, on the associated state mortality data files.

Some state mortality data files also contain the associated occupation and industry codes in addition to the text data. Much work remains to be done to improve the availability and capture of this potentially important information.

NAACCR will continue to discuss the quality and completeness of occupation and industry data and will reconsider the inclusion of occupation and industry in its recommended data sets.

Sequence Number [380 and 560]

As discussed in Chapter III, SEER, NPCR, and CoC have different standards for determining tumors that are reportable and are to be included in the registry. In addition to collecting these required tumors, some registries also collect and assign sequence numbers to other tumors such as cervix carcinoma in situ or PIN III.

Two sequence number data items, one assigned by the reporting facility, Sequence Number--Hospital [560], and one assigned by the central registry, Sequence Number--Central [380], are now in use. The time period of both Sequence Number data items is a person’s lifetime, although with earlier definitions of Sequence Number--Central [380], central registries historically assigned the numbers from the reference date of the registry. When reportability of a particular tumor changes over time, both the type and the timing of tumors may affect the assignment of sequence numbers, so it is possible for two patients having similar cancer histories to be characterized by different sets of sequence numbers.

Numerous operational issues, such as storage of multiple facility-specific sequence numbers, appropriate linkage rules, and feedback of data to hospitals, have arisen because of policy differences from state to state. When using Sequence Number--Central [380] to identify individuals who have had only one lifetime cancer, it is important to realize that the definitions used to make that determination vary and that sequencing may be handled differently in different systems.


AJCC TNM Stage, SEER EOD, SEER Historic Stage, SEER Summary Stage (1977 and 2000), and Collaborative Staging [759-1070, 1090-1170, 2800-3050]

Historically, four major staging schemes have been widely used in cancer registries in the United States. The schemes--AJCC TNM, SEER Extent of Disease, SEER Historic Stage, and SEER Summary Stage--differ in complexity, purpose, structure, rules, and definitions. AJCC TNM staging provides clinical utility. SEER EOD provides longitudinal stability for epidemiological studies. SEER Historic and SEER Summary Stage provide population surveillance staging capability. Several oncology subspecialties have developed staging systems applying to a limited number of cancer sites.

In January 2004, the Collaborative Staging System was introduced to reduce duplication of effort and provide a common staging schema from which the major staging categories could be electronically derived. All standard setters in the United States required the use of the Collaborative Staging System version 1 for cases diagnosed January 1, 2004, through December 31, 2009, but not every standard setter required every data element. In Canada, the Collaborative Staging System version 1 was adopted as the first national stage data collection standard for cases diagnosed on or after January 1, 2004; as in the U.S., not all data elements are required. Provincial/Territorial registries have been moving gradually since 2004 toward collecting stage data on all newly diagnosed cases but have not yet achieved that goal. CS version 2, based on AJCC 7th edition and renamed the Collaborative Stage (CS) Data Collection System, is effective for cases diagnosed January 1, 2010, and later.
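The diagnosis-date boundaries described above can be expressed as a small lookup. The following is a minimal sketch only: the function name `cs_version_for` and the returned labels are hypothetical and are not part of the NAACCR layout, and the sketch says nothing about which individual data elements a given standard setter required.

```python
from datetime import date
from typing import Optional

# Boundaries taken from the text above; the names are illustrative assumptions.
CS_V1_START = date(2004, 1, 1)
CS_V1_END = date(2009, 12, 31)


def cs_version_for(dx_date: date) -> Optional[str]:
    """Return the Collaborative Staging version that applies to a diagnosis date.

    Returns None for cases diagnosed before January 1, 2004, when
    Collaborative Staging had not yet been introduced.
    """
    if dx_date < CS_V1_START:
        return None
    if dx_date <= CS_V1_END:
        return "CS version 1"
    # Cases diagnosed January 1, 2010, and later.
    return "CS version 2"
```

A pooled file spanning 2003 through 2011 would therefore contain cases from three different staging regimes, which is one reason stage comparisons over time require care.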

The historic schemes were designed for different purposes at different times, and are not easily compared. Conversion among the seven editions of the AJCC TNM Cancer Staging Manual is often not possible. Minor differences exist between the SEER Summary Staging guides of 1977 and 2000. SEER published the Comparative Staging Guide for Cancer6 in 1993 as an attempt to present comprehensive, site-specific comparisons of the AJCC TNM, SEER EOD, and SEER Summary Staging schemes as an aid in data collection and interpretation. This guide covered the major cancer sites of colon and rectum, lung and bronchus, breast, female genital, prostate gland, and urinary bladder. According to the guide:

For these reasons, comparing cancer registry data by stage over time or across registries, or using pooled data collected by different registries applying different staging schema, is problematic.6

For a discussion of staging issues that affect rules for case inclusion and reportability, see Chapter III, especially the paragraphs “In Situ/Invasive” and “Multiple Primary Rules.”

A summary of the major staging/coding schemes is provided below.

Tumor Size Rules [780]

Over the years, some of the rules for describing tumor size changed several times, and discrepancies existed between the CoC and SEER data. With the implementation of the CS coding system in 2004, all the differences between the two groups’ guidelines for tumor size have now been resolved.

The sites for which the tumor size guidelines differed are listed below. Users of registry data must be aware of possible discrepancies in the meaning of the information recorded in this variable for cases diagnosed before the years indicated in parentheses.


Historically, NPCR has recommended collecting the date and type of first course of definitive treatment when available.29 For the 1996-1997 diagnosis years, NPCR-funded registries were required to collect and process available treatment information using either the (1995 or 1996) SEER Program treatment data set or the (1995 or 1996) CoC treatment data set.

For 1998-2000, NPCR had a similar recommendation. NPCR-funded registries adopted either the SEER 1998 or the CoC 1998 treatment data set, and were encouraged to use the data item “RX Coding System--Current” [1460] to indicate how treatment was coded for a specific record.

Beginning with 2003 diagnoses, the CoC FORDS2 redefined some treatment fields and added others. Some new and redefined data fields along with dates of treatment are required by NPCR. For the 2003 and forward diagnosis years, NPCR requires the collection of first course of treatment data items when available and requires the submission of the NPCR-required surgery data items. NPCR uses the same codes as CoC FORDS, but does not collect all the data fields. See the list of data items (Chapter VIII) that NPCR registries collect.

SEER will use the same codes as the CoC FORDS but may not collect all of the fields. For example, SEER areas will not collect Rad--Treatment Volume. See the list of data items (Chapter VIII) that SEER areas collect and that SEER requires the SEER registries to transmit to NCI. SEER areas will use the fields Rad--Regional RX Modality [1570] and Rad--Boost RX Modality [3200] from CoC hospitals to complete RX Summ--Radiation [1360].

RX Summ--Rad to CNS [1370]

This item is maintained in the transmission file for use with historic data. CoC discontinued collection of the item for cases diagnosed on or after January 1, 1996, and SEER discontinued collecting it for tumors diagnosed beginning in 1998. For cases diagnosed after those dates, both organizations instructed coders to record radiation to the central nervous system as radiation. SEER retains the codes for earlier cases and also converts the data into an appropriate radiation field. The item is no longer supported in any form by CoC.
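When screening historic files, the two discontinuation dates can be checked with a short helper. This is a sketch under stated assumptions: the function name `rad_to_cns_collected` and the string labels for the standard setters are hypothetical, and the helper reflects only the diagnosis-year cutoffs given in the paragraph above.

```python
# First diagnosis year for which each standard setter no longer collected
# RX Summ--Rad to CNS [1370], per the text above.
RAD_TO_CNS_DISCONTINUED_FROM = {
    "CoC": 1996,   # discontinued for cases diagnosed on or after January 1, 1996
    "SEER": 1998,  # discontinued for tumors diagnosed beginning in 1998
}


def rad_to_cns_collected(dx_year: int, standard_setter: str) -> bool:
    """Return True if the item was still collected for this diagnosis year."""
    try:
        cutoff = RAD_TO_CNS_DISCONTINUED_FROM[standard_setter]
    except KeyError:
        raise ValueError(f"unknown standard setter: {standard_setter!r}")
    return dx_year < cutoff
```

For example, a case diagnosed in 1997 falls inside the collection period for SEER but outside it for CoC, so a blank value has a different meaning depending on the source.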

Time Period for First Course of Treatment [1260, 1270, 1500]

SEER and CoC have historically defined first course treatment differently. The differences affect representation of the date first course treatment begins and the instructions for determining what constitutes first course treatment.

The NAACCR record layout provides two data items that indicate the date of the start of the first course of treatment: Date 1st CRS RX CoC [1270] as defined by CoC, and Date Initial RX SEER [1260] as defined by SEER. The two definitions differ in how they handle a decision not to treat. CoC records the date on which the physician decides not to treat the patient as the date of initial treatment. SEER considers such a decision to be no treatment, so the date field is left blank and the corresponding date flag is set to ‘11’.
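The difference between the two date items can be illustrated with a short sketch. It assumes that both programs agree on which treatment counts as first course, so that the dates differ only in the decision-not-to-treat case; the class, function, and attribute names are hypothetical and do not correspond to an actual registry software interface.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class FirstCourseDates:
    date_1st_crs_rx_coc: Optional[date]   # Date 1st CRS RX CoC [1270]
    date_initial_rx_seer: Optional[date]  # Date Initial RX SEER [1260]
    seer_date_flag: Optional[str]         # '11' when the SEER date is blank for "no treatment"


def first_course_dates(
    treatment_start: Optional[date],
    decision_not_to_treat: Optional[date],
) -> FirstCourseDates:
    """Derive both date items from one abstracted case (illustrative only)."""
    if treatment_start is not None:
        # Treatment was given: both items carry the same start date.
        return FirstCourseDates(treatment_start, treatment_start, None)
    if decision_not_to_treat is not None:
        # CoC records the decision date; SEER leaves the date blank with flag '11'.
        return FirstCourseDates(decision_not_to_treat, None, "11")
    # Neither date is known: unknown-date handling is outside this sketch.
    return FirstCourseDates(None, None, None)
```

A user comparing the two items in a pooled file should therefore expect a populated CoC date alongside a blank SEER date for untreated patients, rather than treating the mismatch as an error.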

The SEER and CoC definitions of treatment to be included as “first course” have become increasingly congruent, differing now primarily in their “fall-back” recommendations that apply when no treatment plan is recorded, no standard facility practice applies, no protocol applies, no physician is able to provide assistance, and no record of treatment failure or recurrence of disease is available. In that extreme instance, CoC recommends a 4-month cutoff for the beginning of first-course treatment, and SEER applies a 1-year cutoff for completion of first course of therapy.

Users of historical treatment data should be aware that the definitions of “first course” have changed over time and have differed between standard setters in the past. The applicable coding manuals and standard-setting organizations should be consulted for specifics.

Users of treatment data also should be aware that registries differ in the amount of treatment data collected in terms of the types of treatment included, non-hospital treatment locations surveyed, items covered (see the previous section), and the use of all codes provided for each item. Thus, treatment data are likely to be inconsistent among registries and to have varying levels of completeness, especially for treatment given in physicians’ offices or other non-hospital settings.

Vital Status [1760]

Both SEER and CoC use code 1 in this field to indicate that the patient is alive. However, to indicate that the patient is dead, SEER uses code 4 and CoC uses code 0. Both programs have long-standing historical reasons to retain their coding. No agreement has been reached on this data item.
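Because the two code sets disagree only on the value for a deceased patient, data from both sources can be brought to a common representation before analysis. The following is a minimal sketch; the function name `decode_vital_status`, the labels "alive" and "dead", and the assumption that the user knows which program coded each record are illustrative and not part of the NAACCR standard.

```python
ALIVE = "alive"
DEAD = "dead"

# Vital Status [1760] codes as described in the text above.
VITAL_STATUS_CODES = {
    "SEER": {"1": ALIVE, "4": DEAD},
    "CoC": {"1": ALIVE, "0": DEAD},
}


def decode_vital_status(code: str, standard_setter: str) -> str:
    """Translate a Vital Status code into a common label.

    Raises ValueError for an unknown standard setter or for a code that is
    not defined by that standard setter, instead of guessing.
    """
    try:
        mapping = VITAL_STATUS_CODES[standard_setter]
    except KeyError:
        raise ValueError(f"unknown standard setter: {standard_setter!r}")
    try:
        return mapping[code]
    except KeyError:
        raise ValueError(
            f"code {code!r} is not defined for {standard_setter}"
        )
```

Rejecting undefined codes, rather than mapping them silently, makes it easier to detect a file that was coded under the other program's convention.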

Canadian Data

The NAACCR data standards adopted thus far do not adequately deal with data from places outside the United States. Changes have been made to accommodate postal codes, standard abbreviations for provinces/territories, and other fields in the Canadian data set. A CCCR column has been added to the Required Status Table and future versions of this document will review and increasingly incorporate standards established for Canadian cancer registries.