RUTGERS MIS Exam2 Answers
RUTGERS MIS Exam2 Answers
RUTGERS MIS Exam2 Answers
Q1
30
30
22.0
23.0
1
6.2
0.21
73.4%
Q2
40
40
35.5
36.0
26
3.6
0.09
88.8%
Q3 Total
30
100
30
99
26.7
84.2
29.0
85.0
7
45
5.0
11.6
0.17
0.12
88.9% 84.2%
This was a long and sometimes challenging test. Overall, however, your performance suffered
only slightly from the previous exam, with the mean and median both dropping by only about
2 points. However, the standard deviation was considerably higher than last time, with the
score distribution somewhat more spread out. Viewed in buckets of 10, the distribution has a
nearly perfect sawtooth shape with its peak in the 90s. There were perfect scores on each
problem, but not by the same student, so there were no perfect 100s. The query portion of the
test was the most difficult, with an average of about 73% (interestingly, the same thing tends to
-- 1 --
happen with Access grids instead of SQL). Overall, performance on the database design and
normalization questions was strong, with average scores of nearly 90%.
-- 2 --
This was a fairly straightforward query, except for the fairly large number of tables and
consequent join conditions needed. Note that earliest dates first means ordering dates from
earlier to later, so we need the default ascending sort.
(b) For each type of parking lot, show the lot type description, the number of parking
lots the college has of that type (labeled NumberOfLots) and the total number of
parking spaces in lots of that type (labeled TotalSpaces). The output should be in
order of the total number of spaces, the highest number of spaces first.
SELECT TypeDescrip, Count(LotNumber) AS NumberOfLots,
Sum(NumberOfSpaces) AS TotalSpaces
FROM
PARKINGLOT, TYPEOFLOT
WHERE PARKINGLOT.LotTypeCode = TYPEOFLOT.LotTypeCode
GROUP BY TYPEOFLOT.LotTypeCode, TypeDescrip
ORDER BY Sum(NumberOfSpaces) DESC;
This was a fairly straightforward aggregation query. Although it is highly unlikely that there will
be two types of lots with the same type description, good practice dictates grouping by the
LotTypeCode field so we are absolutely sure different lot types will not be combined into one
group. By the rules of standard SQL, we must also group by TypeDescrip, since we are
displaying it without aggregation. Note that LotTypeCode must be qualified wherever it appears
-- 3 --
because it occurs in more than one of the FROM tables (although it does not matter which table
we specify, due to the WHERE condition). Note that in the Count( ) function, any non-blank
field could be counted instead of LotNumber. You should not use quotes around the names
declared in an AS modifier. I accepted ORDER BY TotalSpaces DESC for full credit, even
though this construction does not work properly in Access, and I made similar allowances in
parts (c) and (e).
On this problem, some people got confused between the lot description (the LotDescription
attribute of PARKINGLOT) and the lot type description (the LotTypeDescription attribute of
TYPEOFLOT). I deducted only 2 points for this error since the distinction was easy to miss.
However, aggregating on LotDescription would probably yield a separate line of output for
nearly every parking lot, which would not make much sense.
(c) For each officer who issued any parking tickets in 2013, show the first name, last
name, total number of tickets issued in 2013, and the total dollar amount of tickets
issued in 2013. Sort the results by the total dollar value of tickets issued, highest
dollar value first.
SELECT FirstName, LastName,
Count(TicketIDNumber), Sum(FineAmount)
FROM
OFFICER, PARKINGTICKET, VIOLATIONTYPE
WHERE OFFICER.BadgeNumber = PARKINGTICKET.BadgeNumber
AND
PARKINGTICKET.ViolationCode = VIOLATIONTYPE.ViolationCode
AND
DateTimeIssued >= #1/1/2013# AND
DateTimeIssued < #1/1/2014#
GROUP BY OFFICER.BadgeNumber, FirstName, LastName
ORDER BY Sum(FineAmount) DESC;
The problem didnt specify names for the aggregate fields, so you didnt have to include AS
modifiers, but they could be included to make the output look better. Since two different officers
could well have the same name, we group by BadgeNumber, which must be qualified wherever it
appears since it appears in two of the source tables. By standard SQL rules, we must also group
by the other displayed non-aggregated fields, FirstName and LastName. The ordering of the
output should be descending, in order to show the highest dollar amounts first. The choice of
TicketIdNumber in the Count( ) function is not critical; any other guaranteed non-blank field
would also work.
Most people missed a date-related subtlety in this problem. Since the DateTimeIssued field
contains both the date and time of issue, and #12/31/2013# refers to the first second of
December 31st, the condition DateTimeIssued <= #12/31/2013# would in fact miss
virtually all tickets issued on December 31st. To fix this problem, one could specify the condition
DateTimeIssued <= #12/31/2013 11:59:59pm#, but it is more succinct to use
DateTimeIssued < #1/1/2014#. Unfortunately, specifying the condition DateTimeIssued
BETWEEN #1/1/2013# AND #12/31/2013# has the same problem as DateTimeIssued
<= #12/31/2013#.
-- 4 --
(d) For each ticket issued on or after January 1, 2014 that was either in lot 102B or had
the violation code EXPERM (which denotes an expired permit), show the date and
time of issue, the lot number, the violation description, and the license plate number
and state of the violator. Sort the results alphabetically by lot number. Within
parking tickets issued for the same lot number, sort by date and time of issue, earliest
tickets first.
SELECT DateTimeIssued, LotNumber, ViolDescrip,
LicensePlateNumber, LicensePlateState
FROM
PARKINGTICKET, VIOLATIONTYPE
WHERE PARKINGTICKET.ViolationCode = VIOLATIONTYPE.ViolationCode
AND DateTimeIssued >= #1/1/2014#
AND (LotNumber = 102B OR
PARKINGTICKET.ViolationCode = EXPERM)
ORDER BY LotNumber, DateTimeIssued;
Here, we need parentheses around the OR conditions to get the correct order of evaluation of
logical conditions. Both 102B and EXPERM are literal character strings, so they should
appear in quotes (either double or single quotes). Note that while I did not deduct for not
capitalizing EXPERM, capitalization is significant within quoted strings and should thus
exactly match what was specified in the problem. Also, ViolationCode should be qualified
wherever it appears since it is present in two different source tables (but it doesnt matter which
table we specify, due to the WHERE condition).
While it is possible to include the PARKINGLOT table in this query, it is not necessary, because
you only need the LotNumber attribute, which is in PARKINGTICKET. Also, aggregation is not
needed for this query, and there is therefore no need to include a GROUP BY clause. If you were
to use GROUP BY, you would have to group by all displayed attributes.
(e) Show all vehicles, as identified by their combination of license plate number and
license plate state, which had over $200 in parking tickets during the last three
months of 2013. For each such vehicle, show the license plate number, license plate
state, the total dollar value of tickets issued to it during October-December 2013, and
the number of such tickets. Sort the results by total dollar value, highest values first.
SELECT LicensePlateNumber, LicensePlateState,
Sum(FineAmount), Count(TicketIDNumber),
FROM
PARKINGTICKET, VIOLATIONTYPE
WHERE PARKINGTICKET.ViolationCode = VIOLATIONTYPE.ViolationCode
AND
DateTimeIssued >= #10/1/2013# AND
DateTimeIssued < #1/1/2014#
GROUP BY LicensePlateNumber, LicensePlateState
HAVING Sum(FineAmount) > 200
ORDER BY Sum(FineAmount) DESC;
-- 5 --
Again, names werent required for the aggregated columns, but it was fine to include them. The
subtleties of specifying the date range are the same as in part (c); note that I did not deduct
missing this detail if you already had a deduction in part (c). It is important not to group by
anything other than LicensePlateNumber and LicensePlateState, or you will get separate lines of
output for most or all individual tickets. The condition about having at least $200 worth of tickets
has to go in the HAVING clause and not WHERE, because this information is only known after
aggregation. Specifying WHERE FineAmount > 200 means something completely different,
namely only adding up individual tickets that were each over $200. Note that you should not
include a $ sign before 200, since doing so will generate a SQL syntax error (currency fields are
treated just like number fields); however, I did not deduct for this error. Some people
concatenated the character strings LicensePlateNumber and LicensePlateState by using syntax
like LicensePlateNumber + LicensePlateState. This happens to work in Microsoft dialects
of SQL, but it does not work in most other vendors dialects. Grouping by LicensePlateNumber
and LicensePlateState groups by all combinations of their values, so there is no need to explicitly
concatenate these character strings.
My basic grading rule for this question was one point per error up to a maximum of 6 points
deduction for each query. So, if you made more than six errors on a query, you received no
credit, even if some elements were correct. A few other notes:
The maximum deduction for omitting the WHERE conditions needed to join the source
tables of a query was 2 points, even if you omitted all three conditions in part (a).
The deduction for completely omitting GROUP BY (a surprisingly common error) was 2
points.
The maximum deduction for omitting attributes from the SELECT clause was 2 points,
even if you forgot more than two fields.
As mentioned above, the deduction for getting confused between the lot description and
the lot type description in part (b) was 2 points.
I deducted 1 point for each of the following minor errors, but only for the first occurrence. If
you made the same error again, I did not deduct.
Putting quotes around the alias declare in an AS modifier
Forgetting to qualify an attribute name if it is present in more than one table used in a
query
Using Count( ) where you should have used Sum( )
Using Sum( ) where you should have used Count( )
Not getting the boundaries of a time period exactly right, as discussed above for parts (c)
and (e)
Not using quotes around a literal character string like 102B or EXPERM
Not using # marks to quote a date
Putting both quotes and # marks around a date
Forgetting to use an AS modifier when the question asked for an aggregated column to be
given a specific name.
(Q2 on next page)
-- 6 --
-- 7 --
The most common errors on this problem related to maintenance. The most common mistake
was forgetting to put the Quantity (of parts used) attribute anywhere, for which I deducted one
point. Quite a few people lost 5 points by using the wrong kind of relationship between
MAINTENANCE and PARTS, or merging the MAINTENANCE and HAS tables.
Q3. Database Normalization: Hotel Information (30 Points)
I kept this problem straightforward because I was concerned about the time it would take to write
all the SQL queries in Q1 (although I still underestimated it). There was also a typo in the table
giving two different hotels the same ID, although most people did not notice it (I wrote a
correction on the board during the exam). Here is the solution I had in mind:
(see next page)
-- 8 --
CHAIN(ChainCode, ChainName)
FEATURE(FeatureCode, FeatureDescription)
HOTEL(HotelID, ChainCode, HotelName City, State)
ChainCode foreign key to CHAIN
HAS(HotelID, FeatureCode)
HotelID foreign key to HOTEL
FeatureCode foreign key to FEATURE
The HOTEL-FEATURE many-to-many relationship could of course also be depicted as two oneto-many relationships. My grading scheme, out of 30 points, was as follows:
4 points (2 point for the diagram, 2 points for the outline) for each of the three tables
CHAIN, HOTEL, and FEATURE.
4 points for the one-to-many relationship between CHAIN and HOTEL (2 points for the
diagram, 2 points for the outline).
The many-to-many relationship between HOTEL and FEATURE was treated as being
equivalent to one additional table (4 points) and two additional one-to-many relationships
(4 points each), plus 2 bonus points, for a total of 14 points (6 for the diagram and 8 for
the outline)
Quite a few people included an extra table for cities or locations. Such a table is not required to
normalize the given data, because the dataset contains no attributes that depend on the
combination (City, State). I didnt deduct for an unnecessary extra locations table if you
implemented it correctly, since it is possible to anticipate such a table being needed in the future.
I did deduct 2 points for having City determine State: not only is this assumption objectively
false, but it is also contradicted by the data sample, which contained hotels in both Portland, OR
and Portland, ME.
I deducted one point for introducing a zip code field. Normalization does not involve adding
non-synthetic new fields not in the original design (in fact, some experts would not include
adding even synthetic key fields as part of normalization).
Some people used (HotelID, HotelName) as a primary key for HOTEL. I decided not to deduct
for this, because it could have been caused by my typo that gave two hotels the same ID.
-- 9 --
-- 10 --