-
Notifications
You must be signed in to change notification settings - Fork 2
/
Copy pathNEWS
121 lines (102 loc) · 5.2 KB
/
NEWS
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
# Change log
## 3.1.0
- Fixes
- `naaccr_encode` now outputs the correct date time format for partial
date time values. Previously, it simply copied whatever format the original
string used.
## 3.0.0
- !!Breaking change!!
- `naaccr_endcode` no longer pads values with blanks to the full width of
their fields.
If the standard padding character is not a blank (e.g., `"0"`), then
the values will still be padded.
For example, `naaccr_encode("New York", field = "addrAtDxCity")` returns
`"New York"`, and `naaccr_encode(1, field = "ageAtDiagnosis")` returns
`"001"`.
- `naaccr_encode` will still set overly-wide values to blanks with a warning.
It is a helper function for writing records to a file, and even XML file
readers may break if values are wider than expected.
- New things
- NAACCR format versions 24 and 25 have been added.
- `naaccr_datetime` now accepts values in ISO-8501 format (with timezones!).
- New function `parse_geocoding_quality_codes` to break the dense
"geocodingQualityCodeDetail" field into its 14 data components.
- Fixes
- Fixed factor label for `serumBeta2MicroglobulinPretxLvl` value of `"0"`.
## 2.0.2
- Fix a bug with `read_naaccr_xml_plain` when a `record_format` is provided
## 2.0.0
- New functions
- `read_naaccr_xml_plain` and `read_naaccr_xml` for reading files in
NAACCR's XML formats
- New data
- `naaccr_formats`, a named list containing `record_format` objects for
the official NAACCR formats
- `naaccr_format_21`, `naaccr_format_22`, and `naaccr_format_23` for the
NAACCR XML formats, versions 21, 22 and 23
- Possible breaking changes
- New columns in `record_format`
- `"parent"`: the field's parent XML node
- `"width"`: field's length in characters
- `"cleaner"`: function used to clean/standardize data when reading
- `"unknown_finder"`: function to replace values with `NA` when reading
- Deprecated columns in `record_format`
- `"end_col"`: not used for XML files, and inferred from `"start_col"`
and `"width"` for fixed-width files
- Package data now compressed using the `"xz"` method, which requires
R version 2 or higher
## 1.0.0
- Added `field_levels` and `field_levels_all` lists, which make it easy to see
the possible values for factor fields.
- All factor and sentinel levels are now easy to type on a standard U.S.
QWERTY keyboard.
- Repeatedly applying `naaccr_factor` or `naaccr_sentinel`
(e.g., `naaccr_factor(naaccr_factor(...))`) is now a no-op.
- Removed AJCC-copyright material.
- Functions are safer: more argument checking, warnings, and option-robustness.
- Fixed bugs with handling custom formats and columns not in the format.
- Fixed bug with formats that included fields not found in text files
## 0.5.0
- Processed date and time fields now store their original text values in the
`"original"` attribute. Partial dates and times are still useful.
- New functions
- `unknown_to_na`: Convert levels of NAACCR factor to `NA` if they mean
the value is unknown.
- `naaccr_encode`: Convert processed values back to their NAACCR text
codes.
- `as.record_format`: Create a custom record format from an object
(this was incorrectly an internal function before).
- `naaccr_date`, `naaccr_datetime`: Parse NAACCR-formatted dates and times.
- `read_naaccr_plain`: Read a NAACCR file and divide it into fields, but
don't do any more processing.
- `split_sequence_number`: Divide the different pieces of information in a
tumor sequence number into multiple columns (i.e., order, uniqueness,
and invasiveness).
- Revamped `read_naaccr` and `write_naaccr`
- *Much* faster!
- `read_naaccr` has more parameters to handle how a file is read:
`skip`, `nrows`, and `encoding` like in `read.table`, and `buffersize`
to ease the pain of reading large files.
- `naaccr_record` and its kin now have a `format` parameter to use custom
formats.
- Treat behavior fields as coded factor, as they should've been
- **Potentially breaking changes**
- Cleaning functions (e.g., `clean_count`) no longer use a field name to
retrieve cleaning parameters (e.g., field width).
Those parameters now need to passed directly.
- The `type` and `alignment` columns of `record_format` objects are now
factors.
- The standard NAACCR record format objects (`naaccr_format_12`, ...)
are no longer lazy-loaded. They are now immediately loaded.
- Improved documentation throughout.
## 0.4.0
- Major improvements to handling of field-specific codes
## 0.3.0
### New features
- New function: `write_naaccr` for writing `naaccr_record` data sets to text
files according to one of the NAACCR formats. For now, it writes all unknown
and missing values as blanks, not the standard codes. Still, `write_naaccr`
and `read_naaccr` are inverse functions.
- Fields with country codes are now converted to factors.
### Backend
- Using package `ISOcodes` instead of `maps` for location code tables