InIn 4078: Statistical Quality Control

Class Project to be developed during the semester

Exploratory Data Analysis and Process Capability



Problem Statement



Assume that you are working as a quality assurance engineer. One of the operations under your supervision is the filling of 20 oz. cereal boxes for human consumption. For the last 6 weeks, Monday through Saturday, you have been collecting data on the cereal content (weight) of the boxes at three filling machines (A, B, and C). During each of the three daily shifts, 10 samples of 7 boxes each are randomly selected from the filling operation. These data are stored in the file "CerealData01.txt". The file is in comma-separated-value (CSV) format: values are separated by commas rather than blanks, and a missing value appears as two consecutive commas. The file has 6 columns and 11,340 rows; it looks like this:


1,"A",1,"M",1,18.84
1,"A",1,"M",1,19.93
1,"A",1,"M",1,20.36
1,"A",1,"M",1,19.29
1,"A",1,"M",1,20.5
1,"A",1,"M",1,19.02
1,"A",1,"M",1,21.49
2,"A",1,"M",1,18.95
2,"A",1,"M",1,21.64
....
656,"B",2,"T",3,20.76
656,"B",2,"T",3,19.7
656,"B",2,"T",3,19.73
....
1620,"C",6,"S",3,19.87
1620,"C",6,"S",3,20.5
1620,"C",6,"S",3,19.74
1620,"C",6,"S",3,20.13
1620,"C",6,"S",3,20.18
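
As a rough illustration of this file format, the following Python sketch reads comma-separated rows with the standard csv module and turns empty fields (a missing value written as two consecutive commas) into None. The function name read_rows and the inline sample text are assumptions made for this example; the second sample row, with its missing third field, is hypothetical and is not taken from the data file. Fields are left as strings; converting them to numbers is a separate step.

```python
import csv
import io


def read_rows(text_stream):
    """Parse CSV rows from a text stream; empty fields become None."""
    rows = []
    for fields in csv.reader(text_stream):
        if not fields:  # skip blank lines, if the file has any
            continue
        # The csv module strips the double quotes around "A", "M", etc.
        rows.append([f if f != "" else None for f in fields])
    return rows


# Inline sample: the first row is from the listing above, the second is a
# hypothetical row with a missing value (two consecutive commas).
sample_text = '1,"A",1,"M",1,18.84\n2,"A",,"M",1,18.95\n'
rows = read_rows(io.StringIO(sample_text))
for row in rows:
    print(row)

# To read the real file instead (assuming it is in the working directory):
# with open("CerealData01.txt", newline="") as f:
#     rows = read_rows(f)
```

If the whole file is read this way, checking that the number of rows is 11,340 and that every row has 6 fields is a quick sanity test before any analysis.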


The first column gives the sample number, from 1 to 1620; each sample number is repeated 7 times because each sample has 7 observations. The second column gives the machine (A, B, or C) from which the sample was taken: samples 1 to 540 come from machine A, samples 541 to 1080 from B, and samples 1081 to 1620 from C. The third column gives the week in which the sample was taken; weeks are numbered from 1 to 6. The fourth column gives the day of the week: M=Monday, T=Tuesday, ..., S=Saturday.

The fifth column gives the shift within the day: shift 1 runs from 6:00 am to 2:00 pm, shift 2 from 2:00 pm to 10:00 pm, and shift 3 from 10:00 pm to 6:00 am. Ten samples are taken during each shift; assume that they are taken every 48 min. starting at 6:00 am (6:00 am, 6:48 am, 7:36 am, 8:24 am, 9:12 am, and so on), and remember that 7 boxes are weighed at each of these times. Finally, the sixth column records the weight (in oz.) of the cereal in the box.
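
The sampling schedule can be sketched in Python as follows. Since 10 samples at 48 min. intervals span exactly 480 min. (8 hours), the schedule lines up with the start of each shift. The function name sample_clock_time and the index k (the position of a sample within its shift, 1 to 10) are assumptions made for this example; the file itself does not store k or the clock time.

```python
from datetime import datetime, timedelta

SHIFT_START_HOUR = {1: 6, 2: 14, 3: 22}   # shift start times from the statement
SAMPLE_INTERVAL = timedelta(minutes=48)   # one sample every 48 min.
SAMPLES_PER_SHIFT = 10


def sample_clock_time(shift, k):
    """Return the clock time (24-hour 'HH:MM') of the k-th sample in a shift."""
    if shift not in SHIFT_START_HOUR:
        raise ValueError("shift must be 1, 2, or 3")
    if not 1 <= k <= SAMPLES_PER_SHIFT:
        raise ValueError("k must be between 1 and 10")
    # The calendar date is arbitrary; only the time of day is reported.
    start = datetime(2000, 1, 1, SHIFT_START_HOUR[shift])
    return (start + (k - 1) * SAMPLE_INTERVAL).strftime("%H:%M")


# First five sampling times of shift 1, matching the list in the text.
print([sample_clock_time(1, k) for k in range(1, 6)])
```

Times are reported on a 24-hour clock, so the late samples of shift 3 fall after midnight (for example, the tenth sample of shift 3 is taken at 05:12).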