1901 Toronto Area Test Sample User Guide
TATS - 1901 Toronto Area Test Sample User Guide.
Canadian Century Research Infrastructure (CCRI) at York University.
Contents
Introduction
The Census of 1901
Schedule 2
Sample Design
Toronto City and Greater Toronto Area (GTA), 1901
The Sample
Large Households, Institutions or Group Quarters
Data Entry
Local Verification
Sample Point Substitutions
Variables in the Database
Recommendations on Coding Schemes
Electoral Maps for 1901 Toronto Area Districts
Introduction
The 1901 Toronto Area Test Sample (TATS) is a large sample representing a full 20 percent of all census-defined dwellings, drawn from the original population schedule (Schedule 1) of the 1901 Census of Canada for the Toronto area. All individuals recorded in the sampled dwellings are included. The sample contains 52,702 individual records. TATS was taken as part of the testing of data entry protocols and the training of data entry operators in the course of the Canadian Century Research Infrastructure (CCRI) project at York University. For accounts of the CCRI, and access to the publicly available 1911 national sample of households and related materials, go to http://ccri.library.ualberta.ca/. The design of the Toronto sample was based directly on experience in a prior project, the Canadian Families Project (CFP), conducted at the University of Victoria. See http://web.uvic.ca/hrd/cfp/data/index.html.
Professor Gordon Darroch, a team leader of the CCRI project, was one of the principal researchers in the CFP. We wish to thank Doug Thompson and Patrick Frisby at the University of Victoria for their assistance in drawing the TATS sample. The TATS was initially under the supervision of Dr. Evelyn Ruppert, the first coordinator of the York CCRI Centre, now Senior Research Fellow, CRESC, The Open University, United Kingdom (2006). Alden Cudanin and Nicola Farnworth of the York CCRI centre were responsible for supervising and coordinating the test sample data entry. Alden Cudanin and Gordon Darroch were responsible for this User Guide and for making the data files available.
The Census of 1901
We recommend that researchers read the account of the conduct of the 1901 census provided in the User Guide of the Canadian Families Project (see the web link above). The original enumerators’ instructions are also available on that website. The population was to be recorded as of 31 March 1901, and all information was meant to be accurate as of that date (rather than the day on which the enumerator visited a dwelling). Where information was to relate to a year or “the census year,” the year ran from 1 April 1900 to 31 March 1901.
The Canadian census of 1901 was a de jure census. The Census Act did not define that term, but the Instructions to Officers make an attempt: people were to be enumerated, not necessarily where they were actually located on 31 March 1901, but in “their home or usual place of abode.” See also articles 70 through 78 of the Instructions (pp. xx-xxi) and the references to Special Form A on the CFP website. As the CFP documentation indicates, persons temporarily absent, such as a fisher at sea or a logger in a logging camp or a commercial traveller on the road, were to be enumerated in their usual place of residence. In the case of persons away from home, where there was no “fixed period of return,” there should be no Schedule 1 entry. The application of these instructions must have led to some enumeration difficulties and variations in results.
Although this sample was recorded in the service of the larger CCRI project, it is unusually large and valuable as a research tool, and we therefore wish to make it publicly available. We did not, however, have the resources to code the variables. Below we provide an account of the sample, the data entry procedures, the variables in the database, and a guide to possible coding by users, drawing on the CFP coding schemes.
Schedule 2
Users should note that Schedule 2 of the 1901 census has also been preserved and is available in digital image form. It was intended to be an extension of Schedule 1: an enumeration of properties for persons named in Schedule 1, not an independent enumeration of all properties in the country. We have entered data from Schedule 1 only, but Schedule 2 data on properties could be added to the file by interested researchers.
Sample Design
The sampling process used for the Canadian Families Project (CFP) was adopted for the 1901 Toronto Area sample. The sample points are census-defined dwellings. Whereas the CFP consisted of a 5% sample of each subdivision within the entire country, the TATS is a 20% sample of dwellings within each subdivision of the Toronto Area (see geographic boundaries below). An indexing process, which entailed counting and documenting the total number of dwellings in each subdivision, was undertaken during the creation of the CFP sample and did not have to be repeated for this Toronto Area sample. Since only selected census districts were being sampled, a system was developed to create an ‘index’ file listing the number of dwellings/sample points by district and subdistrict for the entire area. An SPSS script then selected a stratified random sample of the numbers from this list, creating the 20% 1901 Toronto Area sample. All the information for each dwelling/sample point selected was then entered into the database. Further details on 1901 CFP sampling can be found on page 6 of The National Sample of the 1901 Census of Canada User Guide, 2002, http://web.uvic.ca/hrd/cfp/data/index.html.
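The selection step described above can be sketched in Python. This is an illustration only, not the SPSS script the project used: the layout of the index, the function name `draw_dwelling_sample`, the example dwelling counts, the fixed seed and the rounding rule are all assumptions made for the example.

```python
import random

def draw_dwelling_sample(index, fraction=0.20, seed=1901):
    """Draw a random sample of dwelling numbers within each stratum.

    index maps (district, subdistrict) -> total number of dwellings counted.
    Returns a dict mapping each stratum to a sorted list of dwelling numbers.
    """
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    sample = {}
    for stratum, n_dwellings in sorted(index.items()):
        # Assumed rounding rule: nearest whole dwelling, at least one per stratum.
        k = max(1, round(n_dwellings * fraction))
        # Dwellings are numbered 1..n within the stratum; sample without replacement.
        sample[stratum] = sorted(rng.sample(range(1, n_dwellings + 1), k))
    return sample

# Hypothetical index entries (the counts are invented for illustration).
index = {(116, "A"): 250, (117, "B"): 105}
sample = draw_dwelling_sample(index)
print({stratum: len(points) for stratum, points in sample.items()})
```

Sampling within each district and subdistrict separately is what makes the draw stratified: every subdistrict contributes close to 20% of its own dwellings, rather than 20% being drawn from the area as a whole.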
Toronto City and Greater Toronto Area, 1901
The TATS was designed to represent the City of Toronto and what would be the equivalent of the Greater Toronto Area in 1901. Districts 116, 117 and 118 and parts of district 129 cover all 6 wards of the city; the remaining sub-districts of district 129 and all of districts 130 and 131 cover the rest of the area. The following chart identifies the districts and sub-districts sampled within the entire Toronto Area.
| District | Sub District |
|---|---|
| 116 Toronto Centre | Toronto City, ward 3 (part) |
| 117 Toronto East | Toronto City, ward 1 |
| | Toronto City, ward 2 |
| | Toronto City, ward 3 (part) |
| 118 Toronto West | Toronto City, ward 3 (part) |
| | Toronto City, ward 4 |
| | Toronto City, ward 5 |
| | Toronto City, ward 6 |
| 129 York East | East Toronto Village |
| | Markham |
| | Markham, Village |
| | North Toronto (part), Town - Ville |
| | Scarboro |
| | Toronto City, ward 1 |
| | Toronto City, ward 2 |
| | Toronto City, ward 3 |
| | Toronto City, ward 4 |
| | York (part) |
| 130 York North | Aurora, Town - Ville |
| | Bradford, Village |
| | Georgina & Georgina Island |
| | Gwillimbury, East - Est |
| | Gwillimbury, North - Nord & Snake Island |
| | Gwillimbury, West - Ouest |
| | Holland Landing, Village |
| | King |
| | Sutton Village |
| 131 York West | Etobicoke |
| | North Toronto (part), Town - Ville |
| | Richmond Hill, Village |
| | Toronto Junction, Town - Ville |
| | Vaughan |
| | Weston, Village |
| | Woodbridge, Village |
| | York (part) |
Provided at the end of this user guide are the electoral boundaries maps for each district in the 1901 Toronto Area sample. These maps allow users to further examine the census districts and sub-districts for which the 1901 returns were enumerated. The electoral maps were published by the federal government in 1895. Since the 1901 census districts and the 1895 electoral districts have very similar boundaries, the electoral maps provide users with an accurate, detailed description of the Toronto Area districts that were enumerated in 1901.
The Sample
The sample yielded 9,187 dwellings and 52,702 individuals. The Toronto Area population in 1901, according to the published census, was 268,899; there were 46,134 dwellings within the area. Our sample appears to include 19.91% of all dwellings and 19.59% of all individuals counted in the published census. Users need to remember that the sampling unit here is the dwelling, not the individual. However, comparisons with published census data in the following tables suggest that the sample closely represents the distributions of dwellings and population by districts and subdistricts. Further comparisons with published tables of dwelling and population characteristics for the same geographic area can be conducted by researchers.
Table 1. Dwellings by District

| District | Sampled Dwellings | Total Dwellings | Sample as % of Total |
|---|---|---|---|
| 116 Toronto Centre | 970 | 4829 | 20.09 |
| 117 Toronto East | 1743 | 8586 | 20.30 |
| 118 Toronto West | 3110 | 15495 | 20.07 |
| 129 York East | 1640 | 8273 | 19.82 |
| 130 York North | 781 | 4033 | 19.37 |
| 131 York West | 943 | 4918 | 19.17 |
| Totals | 9187 | 46134 | 19.91 |
Table 2. Population by District

| District | Sampled Individuals | Total Population | Sample as % of Total |
|---|---|---|---|
| 116 Toronto Centre | 5675 | 28765 | 19.72 |
| 117 Toronto East | 9281 | 45621 | 20.34 |
| 118 Toronto West | 16024 | 81712 | 19.61 |
| 129 York East | 8083 | 40405 | 19.94 |
| 130 York North | 3606 | 18778 | 19.21 |
| 131 York West | 10033 | 53618 | 18.72 |
| Totals | 52702 | 268899 | 19.59 |
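Users who wish to check or extend these comparisons can recompute the sample percentages from the counts. The sketch below uses the district counts as printed in Tables 1 and 2; the dictionary layout and the function name `percent_sampled` are assumptions made for the example.

```python
# Counts copied from Tables 1 and 2: district -> (sampled, total).
dwellings = {
    "116 Toronto Centre": (970, 4829),
    "117 Toronto East": (1743, 8586),
    "118 Toronto West": (3110, 15495),
    "129 York East": (1640, 8273),
    "130 York North": (781, 4033),
    "131 York West": (943, 4918),
}
population = {
    "116 Toronto Centre": (5675, 28765),
    "117 Toronto East": (9281, 45621),
    "118 Toronto West": (16024, 81712),
    "129 York East": (8083, 40405),
    "130 York North": (3606, 18778),
    "131 York West": (10033, 53618),
}

def percent_sampled(counts):
    """Return {district: sample as % of total}, plus an overall 'Totals' entry."""
    result = {d: 100 * s / t for d, (s, t) in counts.items()}
    sampled = sum(s for s, _ in counts.values())
    total = sum(t for _, t in counts.values())
    result["Totals"] = 100 * sampled / total
    return result

for name, counts in (("Dwellings", dwellings), ("Population", population)):
    print(name)
    for district, pct in percent_sampled(counts).items():
        print(f"  {district}: {pct:.2f}%")
```

Because the sampling unit is the dwelling, the dwelling percentages are the sampling fractions proper; the population percentages show how closely the individuals living in the sampled dwellings track the published totals.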
Large Households, Institutions or Group Quarters
Some dwellings selected in the sample were, of course, unusually large, as in the case of hospitals, orphanages, asylums and other institutions, including large boarding houses. The 1901 census treated these as separate dwellings. For the purposes of our test sample, large dwellings were treated as single sample units, and data entry operators were instructed to enter all persons in any large institution identified as a sample point. As outlined on page 6 of the CFP’s The National Sample of the 1901 Census of Canada User Guide, 2002, entering all persons in large institutions may affect population estimates based on individual records: overall sample precision is sacrificed in exchange for records of the complete population residing in the sampled large institutions. Unusual sectors of a population are more appropriately treated as separate sampling strata, as was the practice of the larger CCRI project, but this was beyond the purposes of the TATS (see CCRI sampling at http://ccri.library.ualberta.ca/).
Data Entry
Data entry for the 1901 Toronto Area sample took place over four years during three separate time periods, serving the purposes of data-entry training. The first was undertaken in the spring of 2003, the second briefly in the summer of 2007, and the third in the winter of 2008. Operationally, the batches of data entry were carried out somewhat differently, but the data were entered uniformly. Two stages of preparation had to be completed before any data could be entered. First, the sample point text files provided by the CFP’s sampler had to be extracted and made accessible to the data entry software. Second, the images for the districts had to be downloaded and made available to the data entry operators. All of the images were downloaded from the public websites of Library and Archives Canada and Ancestry.com.
Data entry instructions and procedures were followed as specified in the York 1901 Data Entry Manual. Where a data-entry operator (DEO) found it difficult to enter information from the Schedule 1 forms, the Data Entry Supervisor was consulted and prescribed a solution.
Data entry assignment sheets were created for all sub-districts and used to track DEO progress and completed work. Data entry operators recorded the number of people entered for each dwelling, any notes on the entry, and additional comments or requests for a second opinion, and indicated that all entered information had been verified.
Local Verification
Each DEO reviewed their own work by reopening data entry tasks and reading through the transcribed data to ensure that the correct information had been entered and that the procedures in the data entry manual had been followed. Operators reviewed each individual record and its corresponding information row by row and column by column, looking for spelling errors, incorrect values and typos. If any questions about, or inconsistencies in, the entered data arose, the verifying DEO referred back to the census schedule for the dwelling in question and investigated. If corrections or changes were needed, the verifying DEO made them, completing the verification of that dwelling. Where DEOs were not present to validate their own work, their data entry tasks were verified by another DEO. This verification process took place after all data had been entered.
Sample Point Substitutions
If a sample point was deemed unusable, DEOs replaced it with the next complete dwelling. A note giving both the original and the substituted dwelling numbers was then entered in the dwelling notes field of the Parent Table. Only where the unusable dwelling was the last dwelling in the subdistrict was the previously listed dwelling taken as the substitute sample point.
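The substitution rule can be expressed as a short function. This is a sketch under stated assumptions, not the procedure as implemented in the data entry software: the function name `choose_substitute`, the `is_complete` callback, the example dwelling numbers and the wording of the note are all hypothetical.

```python
def choose_substitute(dwelling_numbers, unusable, is_complete):
    """Pick a replacement for an unusable sample point.

    dwelling_numbers: dwelling numbers of one subdistrict, in listing order.
    unusable: the dwelling number originally selected.
    is_complete: function returning True if a dwelling can be entered.
    """
    if len(dwelling_numbers) < 2:
        return None  # nothing to substitute with
    position = dwelling_numbers.index(unusable)
    if position == len(dwelling_numbers) - 1:
        # Last dwelling in the subdistrict: take the previously listed dwelling.
        return dwelling_numbers[position - 1]
    # Otherwise take the next complete dwelling in listing order.
    for candidate in dwelling_numbers[position + 1:]:
        if is_complete(candidate):
            return candidate
    return None  # no complete dwelling follows; the guide does not cover this case

# Hypothetical subdistrict in which dwelling 103 is incomplete.
listing = [101, 102, 103, 104]
substitute = choose_substitute(listing, 102, lambda d: d != 103)
# Hypothetical wording for the Parent Table dwelling note.
print(f"Dwelling 102 unusable; substituted dwelling {substitute}")
```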
Variables in the Database
Parent Table Information
Variable Name Description
Province Province as entered at top of Schedule 1
District District no. from top of Schedule 1
Subdistrict Sub-district letter from top of Schedule 1
Poll Polling subdivision no. from top of Schedule 1
Place City, town, village or township from top of Schedule 1
EnumeratorLastName Enumerator’s last name from top of Schedule 1
EnumeratorFirstName Enumerator’s first name from top of Schedule 1
NumberInDwelling Count of persons in Dwelling (Column 1 counts)
DwellingNumber Dwelling house no. from column 1, schedule 1
NumberFamiliesHouseholds Count of family/hhds (Column 2 counts) in Dwelling
Institution Name of Institution as given by enumerator
DwellingNote Note on dwelling by operator during data entry (in Access, not in SPSS)
DataEntryOperator Name of data entry operator who entered information
Child Table Information
Variable Name Description
PageNo Page number from top of schedule 1
LineNo Line number of individual on schedule 1
HHNo Household no. from column 1, schedule 1
LastName Surname of Family/hhd from column 2, schedule 1
FirstName First name(s) and initials from column 3
Sex Sex (f or m) from Column 4
Colour Colour (usually w, b, r or y) from column 5
RelHead Relationship to head of household from column 6
Marstat Marital Status from column 7
Bday Day and month of birth from column 8
YearBrith Year of birth (4 digits) from column 9
Ageyr Age at last birthday from column 10
AgeMo Age in months (if less than 1) from column 10
BPL Country or place of birth from column 11
UR If born in Canada, whether birthplace rural or urban, column 11
ImmYear Year of immigration to Canada from column 12
NATYR Year of naturalization from column 13
RACE Racial or tribal origins from column 14
NATL Nationality from column 15
RELIGION Religion from column 16
OCC Profession, occupation, trade or means of living from column 17
RETIRED R for retired from column 17
OENMEANS Living on own means from column 18
EMPLOYER Employer from column 19
EMPLOYEE Employee from column 20
OWNACCT Working on own account from column 21
TRADE Working at trade in factory or home from column 22
WORKPLC Name of workplace as given by enumerator
MOEMPFAC Months employed at trade in factory from Column 23
MOEMPHOM Months employed at trade in home from column 24
MOEMPOTH Months employed in other than trade in factory or home, column 25
EARNINGS Earnings from occupation or trade from column 26
EARNSPER Period of earnings if not yearly
EXEARN Extra earnings from other than chief occupation, column 27
MOSCHOOL Months at school in year from column 28
CANREAD Can read from column 29
CANWRITE Can write from column 30
ENGLISH Can speak English from column 31
FRENCH Can speak French from column 32
MTONGUE Mother Tongue from column 33
INFIRM Infirmities from column 34
INDNOTE Note entered by operator during data entry
PROPWNER Property listed on schedule 2 (See page and line number for linkage)
Recommendations on Coding Schemes
The TATS provides data in the form transcribed directly from the census enumerations; the data are uncoded. Researchers can, of course, develop or adopt any suitable coding schemes. The detailed and well-tested CFP coding schemes are recommended. The 1901 CFP coded variables are listed below along with descriptions of the coding schemes used, as outlined in the CFP User Guide (http://web.uvic.ca/hrd/cfp/data/index.html). For the full coding, consult the User Guide on that website.
PROV2 Numeric code for province from Province
Thomas Hillman’s Census Returns...1901 links District numbers to provinces and territories. We created this variable from District numbers in order to see if there was a difference between our totals for each Province (as entered from the enumeration form) and our total for provinces as inferred from District numbers. The results were almost identical, suggesting that enumerators almost always knew which province or District they were in. The codes are:
1 British Columbia 4 Nova Scotia 7 Quebec
2 Manitoba 5 Ontario 8 Territories
3 New Brunswick 6 P.E.I. 9 Unorganized
RelHead Relationship to head of household from column 6
RELHEAD2 * Numeric code for relationship to head
This is the numeric code for relationship to head. To allow for comparability with U.S. census samples, we begin with the 4-digit IPUMS codes for RELATE (IPUMS 95 version 1.0 User’s Guide). Note that the codes are not sex-specific (son gets the same code as daughter). Unfortunately these codes by themselves lacked the flexibility to accommodate all of the variations in our sample. In the IPUMS system, lodgers, boarders, roomers and tenants fall between 1201 and 1207, and employees begin at 1210, leaving no room to add the many variations we find among lodgers and their kin, boarders and their kin, etc. We have therefore added a fifth digit to the codes. The first four digits allow for comparisons with IPUMS samples. In our sample it is often difficult to distinguish institutional employees from non-institutional employees, since there is no single, clear identification of institutions (and enumerators did not always enter the name of the institution in column 7 of Schedule 2). Thus some institutional employees may appear in the coding sequence for Domestic employees (as, for instance, in the case of a maid, cook or laundress who happens to be working in a hotel or an asylum). We have a distinct numeric sequence (13261 through 13273) for religious institutions. Our codes for “other relatives” tend to be inclusive: they include wards and foster children and godsons, but not orphans (who appear under Non-related Youth).
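Users who apply the CFP scheme can recover the IPUMS-comparable code by dropping the fifth digit. The sketch below assumes that RELHEAD2 values are stored as five-digit integers; the function names and the example value 12011 are hypothetical, while the range 13261 through 13273 is the one given above.

```python
def ipums_relate_code(relhead2):
    """Drop the fifth digit of a RELHEAD2 code, leaving the 4-digit IPUMS-style code."""
    return int(relhead2) // 10

def is_religious_institution(relhead2):
    """True for codes in the sequence 13261 through 13273."""
    return 13261 <= int(relhead2) <= 13273

# 12011 is an invented five-digit value in the lodger/boarder range (1201 to 1207).
print(ipums_relate_code(12011))
print(is_religious_institution(13265))
```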
Bday Day and month of birth from column 8
BDAY, BMONTH * NOTE: In the CFP, the information from column 8 is split into two separate variables. In the Toronto Area 1901 sample it is collected together.
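Users who want the CFP-style split of column 8 into separate day and month values could derive it from the combined Bday field. The sketch below is only a starting point: this guide does not document how day and month were transcribed, so the assumed formats (a one- or two-digit day plus an English month name or abbreviation, in either order) and the function name `split_bday` are assumptions.

```python
import re

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]

def split_bday(bday):
    """Split a transcribed 'day and month' string into (day, month number).

    Either part is None when it cannot be recognised.
    """
    text = (bday or "").strip().lower()
    # A day is a run of one or two digits not embedded in a longer number.
    day_match = re.search(r"(?<!\d)(\d{1,2})(?!\d)", text)
    day = int(day_match.group(1)) if day_match else None
    if day is not None and not 1 <= day <= 31:
        day = None
    # A month is the first word whose first three letters match a month name.
    month = None
    for word in re.findall(r"[a-z]+", text):
        if len(word) >= 3 and word[:3] in MONTHS:
            month = MONTHS.index(word[:3]) + 1
            break
    return day, month

for entry in ["12 Mar", "Sept 3", "12th March", ""]:
    print(repr(entry), split_bday(entry))
```

Entries that do not parse are returned as None rather than guessed, so that they can be reviewed against the schedule image.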
BPL Country or place of birth from column 11
BPL2 * Numeric code for birthplace
The numeric codes for birthplace are the 5-digit codes from the IPUMS-95 User’s Guide version 1.0, but with a major extension of the 150_ _ sequence to include all of the entries for Canada. In the IPUMS codes Canadian provinces fall between 15011 and 15081. We have revised the 150_ _ sequence to allow room for provinces and all specific place names entered in this field. Often the province of a specific place cannot be determined (these are entered under 159_ _).
RACE Racial or tribal origins from column 14
RACE2 * Numeric code for racial or tribal origin
This field contains the numeric codes for RACE (“racial or tribal origin”). The RACE codes in IPUMS-95 are not applicable. The comparable field in IPUMS is ANCESTR1, the respondent’s self-reported ancestry or ethnic origin. We apply the 4-digit ANCESTR1 codes, but a major extension was required to accommodate Canadian aboriginal peoples and “mixed bloods”. These are coded from 92_ _ to 98_ _. The codes are intended to reflect all possible variations among the original entries. The coding scheme allows for comparability with IPUMS samples but has the disadvantage (inherent in the IPUMS codes) that aggregation of certain groups important in the Canadian context will not be easy (English are coded 110, Scots 880, Welsh 970). Note that most francophone Canadians were entered as “French” in the original and hence are coded 260.
NATL Nationality from column 15
NATL2 * Numeric code for nationality
The numeric codes for nationality could not be taken from the IPUMS-95 codes for citizenship (CITIZEN), which are too limited for use here. Instead we apply, virtually intact, the 4-digit codes for ANCESTR1 (as for RACE2 above). Enumerators respected the instructions about Canadian but a few distinct entries for aboriginal nations appear.
RELIGION Religion from column 16
RELIGION2 * Numeric code for religion
U.S. historical censuses did not ask respondents to state their religion, and so we cannot apply IPUMS codes. The numeric coding scheme used here is designed for ease of aggregation, and groups religions by broad categories or families. The grouping is adapted from J. Gordon Melton, The Encyclopedia of American Religions (Wilmington, N.C., 1978), volumes 1 and 2, a source which has the advantage that it pays attention to the historical development of religions in North America and the European roots of many.
Enumerators often did not enter religion for aboriginal peoples (and some of the nonstandard forms in District 206 did not even have a column for religion), or entered merely “pagan.” A study of the religious affiliation of aboriginal peoples would require careful over-sampling of reel 6556.
OCC Profession, occupation, trade or means of living from column 17
OCC1 * Numeric code for Profession, Occupation, trade...
The 5-digit codes applied here are an extension of those in CCDO – Canadian Classification and Dictionary of Occupations (Ministry of Supply and Services, 1989), itself an adaptation of ISCO categories. Since enumerators, following their instructions, often stated both the type of work (labourer, clerk, merchant) as well as the “branch” or sector in which the work was done, a decision had to be made about whether to give priority in coding to the type of work or to the sector. Most coding schemes give priority, of necessity, to the type of work - thus all clerks will appear in the same general category, all agents in another category, managers in another category, whatever sector of the economy they may be in. The present coding scheme follows this precedent for most occupations: thus with agents, book-keepers, cashiers, checkers, clerks, dealers, and merchants priority is given to the job or function rather than the sector. The richness of the occupation information, however, allows some priority to be given to sector. Thus foremen, inspectors, labourers (other than general or unspecified), “makers,” managers, and manufacturers are grouped with their industry or sector, where it is given by the enumerator.
The first 3 digits of the code also allow for fine distinctions by economic sector. Thus, difficult as it sometimes is to make the distinction, we have made 5 general categories for clerks. Enumerators often gave more than one occupation, despite the instruction that “the chief or principal calling is the only one to be recorded.” Thus farmers (711) are distinguished from farmers who were given some other occupation as well as farming (712); and farm employees are a separate category (714). The codes are intended to allow for ease of aggregation into very broad categories, using the first 2 digits:
11 Managerial, administrative, financial management, government and related
21 Scientists, architects, and related professionals
23 Law and social institutions
24 Students
25 Occupations in religion
27 Teaching professions
31 Occupations in medicine and health
33 Occupations in the arts and writing
41 Clerical and bookkeeping occupations
51 Commerce and sales occupations
61 Service occupations
71 Agricultural occupations
73 Occupations in fishing, hunting and trapping
75 Occupations in logging and forestry
77 Occupations in mining and oil and gas production
81 through 88 Occupations in primary and secondary processing, manufacture and construction (construction and related fall between 871 and 881)
91/93 Transportation
95 Others (printing and related is 951; stationary engineers and unspecified firemen 953; telegraph and telephone 955)
99 General labour and unclassifiable (with general labour at 991)
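Aggregation by the first two digits can be sketched as follows. The group labels are taken from the list above; the function name, the assumption that OCC1 is stored as a five-digit number, and the example codes (the three-digit prefixes 711, 712, 714, 991 and 871 padded with zeros) are hypothetical.

```python
from collections import Counter

# Broad groups keyed by the first two digits of OCC1, as listed above.
OCC1_GROUPS = {
    11: "Managerial, administrative, financial management, government and related",
    21: "Scientists, architects, and related professionals",
    23: "Law and social institutions",
    24: "Students",
    25: "Occupations in religion",
    27: "Teaching professions",
    31: "Occupations in medicine and health",
    33: "Occupations in the arts and writing",
    41: "Clerical and bookkeeping occupations",
    51: "Commerce and sales occupations",
    61: "Service occupations",
    71: "Agricultural occupations",
    73: "Occupations in fishing, hunting and trapping",
    75: "Occupations in logging and forestry",
    77: "Occupations in mining and oil and gas production",
    91: "Transportation",
    93: "Transportation",
    95: "Others",
    99: "General labour and unclassifiable",
}

def occ1_major_group(occ1):
    """Map a five-digit OCC1 code to its broad group via the first two digits."""
    prefix = int(str(occ1)[:2])
    if 81 <= prefix <= 88:
        return "Primary and secondary processing, manufacture and construction"
    return OCC1_GROUPS.get(prefix, "Unlisted prefix")

# Invented full codes built from prefixes mentioned in the text.
codes = [71100, 71200, 71400, 99100, 87100]
tally = Counter(occ1_major_group(code) for code in codes)
print(tally)
```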
OCC2 * Constructed variable; occupation type
No single set of numeric codes can reflect the full complexity of entries under occupation. Users of census data often require a socio-economic ranking system derived from occupation information. We have not applied a socio-economic ranking here, in part because the census already contains a potentially powerful indicator of the social class of respondents in columns 18 through 21. The limited priority given to sector in the codes for occupation (OCC1), however, risks the loss of important information. OCC1 does not allow one easily to focus on all labourers, or all managers, for instance. To compensate for this loss we apply a simple 2-digit code to flag the presence of specific terms in the occupation information entered by the enumerator. At the very least, these codes will allow users to retrieve certain occupations from the occupational hierarchy that existed in many sectors at the turn of the century. Where the following words do not appear, the field is blank (in SPSS it is “system-missing”).
01 Manufacturer
02 Proprietor
03 Owner
04 Employer
10 Manager/president
12 Secretary
13 Assistant secretary
14 Master
15 Partner
16 Chief/chef
17 Captain
20 Superintendent
21 Supervisor
22 Inspector
30 Agent
31 Solicitor
32 Assistant agent
40 Foreman/forewoman
41 Overseer
42 Boss
81 Employee
82 Labourer
83 Worker
84 Man
85 Hand
86 Woman/lady
87 Operative
88 Servant/domestic
90 Apprentice
91 Assistant
92 Boy
93 Helper
94 Girl/maid
95 Child
96 Son/daughter
97 Wife
98 Woman
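A term flag of this kind can be sketched as a simple lookup on the transcribed occupation string. The sketch covers only a subset of the terms listed above, and several choices in it are assumptions rather than documented CFP rules: whole-word matching, the order in which terms are checked when more than one is present, and the function name `occ2_flag`.

```python
import re

# A subset of the term/code pairs listed above, in the order they are checked.
OCC2_TERMS = [
    ("manufacturer", "01"),
    ("proprietor", "02"),
    ("owner", "03"),
    ("manager", "10"),
    ("president", "10"),
    ("superintendent", "20"),
    ("inspector", "22"),
    ("foreman", "40"),
    ("forewoman", "40"),
    ("labourer", "82"),
    ("apprentice", "90"),
    ("helper", "93"),
]

def occ2_flag(occupation):
    """Return the 2-digit code of the first listed term found as a whole word.

    Returns None when no listed term appears (a blank or system-missing value).
    """
    words = set(re.findall(r"[a-z]+", occupation.lower()))
    for term, code in OCC2_TERMS:
        if term in words:
            return code
    return None

for entry in ["Foreman, cotton mill", "Farm labourer", "Clerk"]:
    print(entry, "->", occ2_flag(entry))
```

Matching on whole words keeps a term such as "man" from being triggered by "manager" or "foreman", should the full term list be added.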
MTONGUE Mother Tongue from column 33
MTONGUE2 * Numeric code for mother tongue
We applied the MTONGUE detailed codes from IPUMS-95, but added a series (9200 to 9295) for aboriginal languages in Canada.
INFIRM Infirmities from column 34
INFIRM2 * Numeric code for infirmities
We created our own numeric codes for infirmities, grouped by first digit as follows:
1 blind
2 deaf/deaf and dumb
3 dumb only
4 unsound mind
5 lamed/cripple
6 idiocy
7 unspecified infirmity or invalid
8 old age/palsy/sick
9 other/illegible
ELECTORAL MAPS
Electoral Map 1 – District 116 Toronto Centre
Electoral Map 2 – District 117 Toronto East
Electoral Map 3 – District 118 Toronto West
Electoral Map 4 – District 129 York East
Electoral Map 5 – District 130 York North
Electoral Map 6 – District 131 York West