# Exercises

## CRUD

### insert() exercises

#### Exercise

* Insert one document to pypeople collection
  * \_id:0, name: Jisu, age: 24
* 결과

```python
{'_id': 0, 'age': 24, 'name': 'Jisu’}
```

* Insert multiple document
  * \_id: 1, name: Deayoung, age: 51
  * \_id: 2, name: Youngjin, age: 24
  * Print all documents
* 결과

```python
 [{'_id': 0, 'age': 24, 'name': 'Jisu’},
  {'_id': 1, 'age': 51, 'name': 'Deayoung’},
  {'_id': 2, 'age': 24, 'name': 'Youngjin'}]
```

### find() Exercises

#### Exercise 1

#### Preliminary

* Import `people.json` into `lab4.people` (i.e. database name is `lab4` and collection name is `people`)

```bash
$ mongoimport -d lab4 -c people people.json
```

#### Questions

* Find the documents whose age under 22.
* Find the documents whose age is between 22 and 26 (inclusive).
* Find the document whose name is “Lee” suppressing the profile.
* Find the documents whose name is not Lee.
* Find the documents whose name is “Lee” or “Park”.
* Find the documents whose name is not “Kim” or “Park” or “Lee”.

### Exercise 2

#### Preliminary

* Import `inventory.json` into lab4.inventory

#### Questions

* Find documents where the *tags* field contains below elements.
  * appliance, school, and book.
  * Hint: recall `$all` operator
* Find all documents but select at most three items in the *tags* field.
  * Hint: recall `$slice` operator
* Find all documents but select items from the second to the fourth in the *tags* field.
  * Hint: recall `$slice` operator with `[skip, limit]`
* Find documents that have brown color in the *qty* field.
  * Hint: recall `$elemMatch` operator

### Exercise 3

#### Preliminary

* Import `store.json` to the store collection

#### Questions

* Find documents where *buyer*'s *name* is "J.S. Lee"
  * Hint: Querying embedded document.
* Print purchased *items* where a *name* in a document is "iPhone Xs"
  * Hint: Querying embedded document, Projection
* Print the buyer’s purchase method where *items* contain the document that price is more than 2000$
  * Hint: < array.field >, Projection
* Find documents that don’t have carrier as key in attribute where the name is "Apple Watch Series 4"
  * Hint: `$elemMatch`, `$exists`

### **Exercise 4**

#### **Preliminary**

* Import `store.json` to the store collection
  * Same with Exercise 3!

#### Questions

* Find documents where buyer's name is "J.S. Lee"
  * Hint: Querying embedded document.&#x20;
* Print purchased items where a name in a document is "iPhone Xs".&#x20;
  * Hint: Querying embedded document, Projection&#x20;
* Print the buyer’s purchase method where items contain the document that price is more than 2000$.&#x20;
  * Hint: `<array.field>`, Projection&#x20;
* Find documents that don’t have carrier as key in attribute where the name is "Apple Watch Series 4"&#x20;
  * Hint: `$elemMatch`, `$exists`

### **Exercise 5**

#### Preliminary

* Import `store.json` to the store collection
  * Same with Exercise 3!

#### Questions

* Get the first three documents and print buyers field only.
* Skip a document and then print two documents
* Count documents in the store collection.

### **Exercise 6**

#### Preliminary

* Using `grade.json` (import it using mongoimport)

#### Questions

* The grade’s document can have among three types:
  * quiz, exam or homework
* Student id (sid) is from 0 to 99
* A document is lost, which contains a quiz score\
  1\. print lost document’s sid.\
  2\. Insert the quiz score (80) to the corresponding sid\
  3\. print the three students that have the highest quiz score.
* A simple example of the dataset looks like ...

```javascript
{"sid": 0, "type": "homework", "score": 69}
{"sid": 0, "type": "quiz" , "score": 29}
{"sid": 0, "type": "exam", "score": 86}
{"sid": 1, "type": "homework", "score": 2}
{"sid": 1, "type": "quiz" , "score": 6}
{"sid": 1, "type": "exam", "score": 67}
{"sid": 2, "type": "homework", "score": 13}
{"sid": 2, "type": "quiz" , "score": 91}
{"sid": 2, "type": "exam", "score": 76}
...
```

### **Exercise 7**

#### Preliminary

* Using `grade.json` (import it using mongoimport)
  * Same with exercise 6

#### Questions

* Write the score searching script
  * Search documents by filtering *type* and *score* fields

#### Results

```
search > homework 13
sid homework
2 13
74 13
```

```
search > quiz 41-49
sid quiz
10 46
17 48
19 44
...
96 49
98 41
```

### update() Exercise

Set a 'location' field to 'Seoul' to a document.

* The document has a 'name‘, 'Deayoung‘.&#x20;

```python
[{'_id': 0, 'age': 24, 'name': 'Jisu'},
 {'_id': 1, 'age': 51, 'location': 'Seoul', 'name': 'De ayoung'},
 {'_id': 2, 'age': 24, 'name': 'Youngjin'}] 
```

* Set a 'location' field to 'Suwon' to documents.
  * The documents have a 'age‘, 24.

```python
[{'_id': 0, 'age': 24, 'location': 'Suwon', 'name': 'Jisu'},
 {'_id': 1, 'age': 51, 'location': 'Seoul', 'name': 'De ayoung'},
 {'_id': 2, 'age': 24, 'location': 'Suwon', 'name': 'Y oungjin'}]
```

### delete() exercise

Delete documents

* * Remove documents that have a 'location‘, 'Suwon‘.
* Print all documents

```
{'_id': 1, 'age': 51, 'location': 'Seoul', 'name': 'Deayoung'}
```

​

### explain() Exercise

* Print a query plan using executionStats corresponding to the query that finds writer, Kim.
* Using blog.json

```python
 "executionStats" : {
…
                "executionTimeMillis" : 0,
                "totalKeysExamined" : 0,
                "totalDocsExamined" : 4,
                "nReturned": 1,
                "executionStages": {
                               "stage": "COLLSCAN",
…
```

## CRUD(Advanced)

* Use a ex1 database and imdb collection.

* We are going to make a movie database like IMDb.

* Each movie has following information.

  * title: String
  * director: String
  * genre: List(String)
  * score: Double

* **Insert a movie**
  * Write a script that inserts movie information to a movie collection.&#x20;
  * Type the movie information as follows:
  * > Movie Title: **Avengers: Infinity War**
    >
    > Director: **Anthony Russo and Joe Russo**
    >
    > Genre: **Action, Adventure, Drama**
    >
    > score: **8.6**
    >
    > \[INFO] Success to insert!
  * Plain: Terminal output&#x20;
  * Bold: User input&#x20;
  * *Hint: use a input() function*

* Confirm that the insertion is operated properly.&#x20;

* Print only movie’s title(s)

```python
[INFO] Movie list
Avengers: Infinity War
```

* Insert three movies more. Use **insert\_many**
  * > Movie Title: Bohemian Rhapsody
    >
    > Director: Bryan Singer
    >
    > Genre: Biography, Drama, Music
    >
    > score: 8.1
    >
    > Quit (Y/n)? n
  * > Movie Title: Happy Death Day 2U&#x20;
    >
    > Director: Christopher Landon
    >
    > Genre:  Drama, Horror, Mystery
    >
    > score: 6.6
    >
    > Quit (Y/n)? n
  * > Movie Title: Spider-Man: Into the Spider-Verse&#x20;
    >
    > Director:  Bob Persichetti, Peter Ramsey
    >
    > Genre:  Animation, Action, Adventure
    >
    > score: 8.6
    >
    > Quit (Y/n)? Y
    >
    > \[INFO] Success to insert!
* Confirm that the insertion is operated properly.
* Print only movie’s title(s)

```python
[INFO] Movie list
Avengers: Infinity War
Bohemian Rhapsody
Happy Death Day 2U
Spider-Man: Into the Spider-Verse 
```

* **Update** movie information
  * Querying movie title and update the information
  * > \[INFO] Enter the information which you want to edit
    >
    > Movie Title: **Avengers: Infinity War**
    >
    > Director: *Anthony Russo and Joe Russo*
    >
    > Genre: *Action, Adventure, Drama*
    >
    > score: *8.6*
    >
    > \[INFO] Success to update!
  * Plain: Terminal output
  * Bold: query
  * Italic: update
* Print all documents

```python
[INFO] Movie list
[{'_id': ObjectId('5c76acb565d1016cd33d4a5d'),
  'director': 'Anthony Russo and Joe Russo',
  'genre': ['Action', ' Adventure', ' Drama'],
  'score': 8.6,
  'title': 'Avengers: Infinity War'},
 {'_id': ObjectId('5c76ad0d65d1016cd33d4a5e'),
  'director': 'Bryan Singer',
  'genre': ['Biography', ' Drama', ' Music '],
  'score': 8.1,
  'title': 'Bohemian Rhapsody'},
 {'_id': ObjectId('5c76ad0d65d1016cd33d4a5f'),
  'director': 'Christopher Landon',
  'genre': ['Drama', ' Horror', ' Mystery'],
  'score': 6.6,
  'title': 'Happy Death Day 2U'},
 {'_id': ObjectId('5c76ad0d65d1016cd33d4a60'),
  'director': 'Bob Persichetti, Peter Ramsey',
  'genre': ['Animation', ' Action', ' Adventure'],
  'score': 8.6,
  'title': 'Spider-Man: Into the Spider-Verse'}]
```

* Remove movie (Plain: Output, Bold: User input)
* Remove a document
  * > \[INFO] Enter the title which you want to remove
    >
    > Movie Title: **Happy Death Day 2U**
    >
    > \[INFO] Are you sure? (Y/n) **Y**
    >
    > \[INFO] Deleted
* Result

```python
[INFO] Movie list
Avengers: Infinity War
Bohemian Rhapsody
Spider-Man: Into the Spider-Verse 
```

## Index

### Exercise 1

1. Create an index in ascending order for the writer field and get the index information. :grinning:

   ```python
   {'_id_': {'key': [('_id', 1)], 'ns': 'lab6.blog', 'v': 2},
   'writer_1': {'key': [('writer', 1)], 'ns': 'lab6.blog', 'v': 2}} 
   ```

2. Again, print a query plan using executionStats corresponding to the query that finds a writer, Kim.

   ```python
    "executionStats" : {
   …
                   "executionTimeMillis" : 0,
                   "totalKeysExamined" : 1,
                   "totalDocsExamined" : 1,
                   "nReturned": 1,
                   "executionStages": {
                                 "inputStage": {
                                            "stage": "IXSCAN",
   …
   ```

3. Create the compound index on id in descending order and writer in ascending order.

   ```python
   {'_id_': {'key': [('_id', 1)], 'ns': 'lab6.blog', 'v': 2}, 
   'id_-1_writer_1': {'key': [('id', -1), ('writer', 1)], 'ns': 'lab6.blog', 'v': 2}, 
   'writer_1': {'key': [('writer', 1)], 'ns': 'lab6.blog', 'v': 2}} 
   ```

4. Drop all indexes on the blog collection.

   ```python
   {'_id_': {'key': [('_id', 1)], 'ns': 'lab6.blog', 'v': 2}}
   ```

### Exercise 2

* Index 생성

  * Create a unique index to doc\_id in ascending order
  * Try to build a unique index to line\_num
  * Create a sparse index to intersect in ascending order
  * Create a unique and sparse index to intersect\_id in ascending order
  * Check the index information

* Compare the results below two cases. (metro collection)

  * Show the first two documents, sorting with doc\_id in ascending order.

  * Using intersect index by hint(), show the first two documents

  * Sort doc\_id in ascending order

  * Use explain() above each case (executionStats).

  * No Index

    ```python
     "executionStats" : {
    …
    	 'executionTimeMillis': 155,
    	 'nReturned': 2,
    	 'totalDocsExamined': 186024,
    	 'totalKeysExamined': 0
    	}
    ```

  * Indexed

    ```python
     "executionStats" : {
    …
    	'executionTimeMillis': 21,
    	'nReturned': 2,
    	'totalDocsExamined': 2,
    	'totalKeysExamined': 2
    	}
    ```

## GeoSpatial Index

### Exercise 1

* Create a geospatial index (2dsphere) for the states collection
  * Get information on states collection’s indexes

    ```python
    {'_id_': {'key': [('_id', 1)], 'ns': 'geo.states', 'v': 2}, 
    'loc_2dsphere': 
    {'2dsphereIndexVersion': 3, 'key': [('loc', '2dsphere')], 'ns': 'geo.states', 'v': 2}}
    ```
* Remove the geospatial index from the states collection
  * Get information on states collection’s indexes

    ```python
    {'_id_': {'v': 2, 'key': [('_id', 1)], 'ns': ‘geo.states'}}
    ```

### Exercise 2

* Use the restaurant collection in the geo database

1. Create a valid query and print the query plan using executionStats•Find documents between '10200‘ and '10280‘ in zipcode
2. Create an index to zipcode (Hint: Embedded Document)
3. Execute #1, again.

* &#x20;No Index

  ```python
    "executionStats" : {
   …
                  "executionTimeMillis" : 11,
                  "totalKeysExamined" : 0,
                  "totalDocsExamined" : 25359,
                  "nReturned": 21,
  …
  ```

* Indexed

  ```python
   "executionStats" : {
  …
                  "executionTimeMillis" : 0,
                  "totalKeysExamined" : 21,
                  "totalDocsExamined" : 21,
                  "nReturned": 21,
  …
  ```

* Use the restaurant collection in the geo database

1. Create a query and print the query plan using executionStats•Find documents that have at least the grade is ‘B’ in the grades.grade field.
2. Create an index to grade (Hint: Indexing on Array)
3. Execute #1, again.

* No Index

  ```python
   "executionStats" : {
  …
                  "executionTimeMillis" : 17,
                  "totalKeysExamined" : 0,
                  "totalDocsExamined" : 25359,
                  "nReturned": 8280,
  …
  ```
* Indexed

  ```python
   "executionStats" : {
  …
                  "executionTimeMillis" : 13,
                  "totalKeysExamined" : 8280,
                  "totalDocsExamined" : 8280,
                  "nReturned": 8280,
  …
  ```

### Exercise 3

* Find international airports in the state that has the above zip code (10044)
  * Sorting the name in ascending order&#x20;
  * Print state’s name and airport name each.

    * Hint: states and small\_zip collections

    ```python
    [{'name': 'Albany County'}, 
    {'name': 'Greater Buffalo Intl'}, 
    {'name': 'Greater Rochester International'}, 
    {'name': 'John F Kennedy Intl'}, 
    {'name': 'La Guardia'}, 
    {'name': 'Massena Intl'}, 
    {'name': 'Niagara Falls Intl'}, 
    {'name': 'Ogdensburg Intl'}, 
    {'name': 'Stewart Intl'}, 
    {'name': 'Syracuse Hancock Intl'}, 
    {'name': 'Watertown International'}] 
    ```
* Find Korean restaurants that are located at less 2km from international airports.
  * Warn: should create 2dsphere index to address for restaurants collection.

![](/files/-LdoIPpZFkXDmZvkkOms)

* Find states that are intersect to the flight path (trajectory).&#x20;
  * Print names including source and destination states.&#x20;

![](/files/-LdoImMimuG7M5dZS3GK)

* Make Geojson format whose type is LineString.&#x20;

  * Find the two coordinates (LAX, DTW) Result

* Result

  ```python
  [{'name': 'Arizona'},
  {'name': 'California'},
  {'name': 'Colorado'},
  {'name': 'Illinois'},
  {'name': 'Iowa'},
  {'name': 'Kansas'},
  {'name': 'Michigan'},
  {'name': 'Nebraska'}, 
  'name': 'Nevada'},
  {'name': 'Utah'}]
  ```

### Exercise 4

* Using the restaurants collection.

* Find restaurants whose names include "Kimchi".&#x20;

* Print only name, zipcodeand address

  ```python
  { "zipcode" : 11231, "address" : "Smith Street 478" }
  { "zipcode" : 11238, "address" : "Washington Avenue 766" }
  { "zipcode" : 10036, "address" : "West 48 Street 18-20" }
  ```

* Using the restaurants and states collections.

* Find restaurants that have "NolbuRestaurant＂(restaurants collection)&#x20;

* Print the state (states collection)

  ```python
  New York
  ```

* Using restaurants.json&#x20;

* Among the "Korean" restaurants, find a restaurant that has "Ramen"but "Izakaya“.&#x20;

* Then, find a restaurant near above restaurant in 150m.

  ```python
  { '_id' : ObjectId('5c8626739f44c9f6768035da'),
    'address' : {'building' : '8', 'coord': [-73.9916305, 40.7247106],
                 'street': 'Extra Pl', 'zipcode': '10003'},
    'borough': 'Manhattan',
    'cuisine': 'Japanese',
    'grades': [{'date': datetime.datetime(2015, 1, 20, 0, 0),
                 'grade': 'Not Yet Graded',
                 'score': 21}],
    'name': 'Ko Ep, Llc',
    'restaurant_id': '50015854'}
  ```

## Text Index

### Exercise 1

* Using restaurants&#x20;
* Build a text index to the 'name' field.&#x20;
* Find restaurants whose names include 'Kyochon‘ Use text index&#x20;
* Print only borough and name fields

  ```python
  { "borough" : "Queens", "name" : "Kyochon Chicken" }
  { "borough" : "Manhattan", "name" : "Kyochon" }
  ```

### Exercise 2

* Take a look the index information in map and countries collections.

* Find universities in Korea.
  * Get geometry information in the countries collection.&#x20;
  * keyword -properties.admin: “South Korea”&#x20;

* Return ten documents with sorting by “properties.name”, and show the name only.&#x20;

  * Note that there is no ‘properties.name’ in a document; therefore, remove the document.

  ```python
  [{'properties': {'name': 'GIST 광주 과학 기술원'}},
   {'properties': {'name': 'ICT폴리택대학'}},
   {'properties': {'name': 'KAIST 문지 캠퍼스'}},
   {'properties': {'name': 'KC대학교'}},
   {'properties': {'name': 'KT원주연수원'}},
   {'properties': {'name': 'U1대학교 아산캠퍼스'}},
   {'properties': {'name': '가천대학교 글로벌캠퍼스'}},
   {'properties': {'name': '가천대학교 메디컬캠퍼스'}},
   {'properties': {'name': '가톨릭관동대학교'}},
   {'properties': {'name': '가톨릭대학교 성신교정'}}]
  ```

* Search the ‘properties.name’ that includes ‘서울대입구’&#x20;

* Using regular expression ($regex) or a text index.

* Sort properties.name in ascending order and display only a name field as below.

  ```python
  [{'properties': {'name': '서울대입구 삼성아파트'}},
   {'properties': {'name': '서울대입구 풍림아이원 1단지'}},
   {'properties': {'name': '서울대입구 풍림아이원 2단지'}}]
  ```

## Aggregation Framework

### Exercise 1

* Use restaurants.json&#x20;
* Using aggregation framework, print documents below.&#x20;

  1\. Find Korean cuisine&#x20;

  2\. Group by borough and count the cuisines.

  ```python
  [{'_id': 'Brooklyn', 'count': 16},
  {'_id': 'Queens', 'count': 167},
  {'_id': 'Staten Island', 'count': 1},
  {'_id': 'Manhattan', 'count': 78}]
  ```

### Exercise 2

* Using aggregation framework, print documents below.&#x20;

  1\. Find Korean cuisine&#x20;

  2\. Unwind grades&#x20;

  3\. Group by borough and grade in grades, and count the document&#x20;

  4\. Sort by count and print five documents

  ```python
  [{'_id': {'borough': 'Queens', 'grade': 'A'}, 'count': 443},
  {'_id': {'borough': 'Manhattan', 'grade': 'A'}, 'count': 232},
  {'_id': {'borough': 'Queens', 'grade': 'B'}, 'count': 140},
  {'_id': {'borough': 'Manhattan', 'grade': 'B'}, 'count': 49},
  {'_id': {'borough': 'Brooklyn', 'grade': 'A'}, 'count': 38}]
  ```

### Exercise 3

* Find international airports in California using aggregate framework

1. states collection&#x20;
   * Get the California location (Polygon)
2. airports collection&#x20;
   * Find “intl” in name field (text)&#x20;
   * Find airports in the California&#x20;
   * Project name, type and code (w/o \_id)&#x20;
   * Sort by name (ascending order) and code (descending order)

### Exercise 4

* Find Korean restaurants that are located at less 2km from international airports (use the type field).&#x20;

  * Warn: should create 2dsphere index to address for restaurants collection.

  ```python
  [{'_id': ObjectId('5c8626729f44c9f6767fe0af'),
  'address': {'building': '0',
               'coord': 497.9151037865528,
               'street': 'Jfk International Airpor',
               'zipcode': '11430'},
   'borough': 'Queens',
   'cuisine': 'Korean',
   'grades': [{'date': datetime.datetime(2014, 6, 18, 0, 0),
               'grade': 'A',
               'score': 2},
              {'date': datetime.datetime(2014, 1, 14, 0, 0),
               'grade': 'A',
               'score': 12},
              {'date': datetime.datetime(2013, 1, 4, 0, 0),
               'grade': 'A',
               'score': 11},
              {'date': datetime.datetime(2012, 4, 25, 0, 0),
               'grade': 'A',
               'score': 13}],
   'name': 'Korean Lounge',
   'restaurant_id': '40625198'}]
  ```

## Exercise 5

* Print the average, minimum, and maximum of passengers(ride\_pasgr\_numfield) of each subway line.&#x20;
* Hint: group by the ‘line\_num’ field and use $max, $min, and $avg.&#x20;
* Result

  ```python
  [{'_id': '경원선', 'avg_station': 7413.872106666667, 'max_station': 27076, 'min_station': 1},
  {'_id': '일산선', 'avg_station': 11100.640561622466, 'max_station': 25857, 'min_station': 1},
  {'_id': '1호선', 'avg_station': 26936.31375, 'max_station': 70147, 'min_station': 4815},
  ...
  {'_id': '수인선', 'avg_station': 3436.322596153846, 'max_station': 9023, 'min_station': 25},
  {'_id': '경강선', 'avg_station': 2294.190909090909, 'max_station': 9201, 'min_station': 192}]
  ```

## Exercise 6

* Print the total number of passengers (ride\_pasgr\_numfield) during 2017-12-21 \~ 2017-12-28.
* ‘use\_dt’ is a stringtype.&#x20;
* Hint: use $match,$gt$lt, $sum, $sort&#x20;
* Result

  ```python
  [{'_id': {'use_dt': '20171221'}, 'total_rider_pasgr': 8505895},
  {'_id': {'use_dt': '20171222'}, 'total_rider_pasgr': 8917017},
  {'_id': {'use_dt': '20171223'}, 'total_rider_pasgr': 6781294},
  {'_id': {'use_dt': '20171224'}, 'total_rider_pasgr': 4866330},
  {'_id': {'use_dt': '20171225'}, 'total_rider_pasgr': 5096990},
  {'_id': {'use_dt': '20171226'}, 'total_rider_pasgr': 7935515},
  {'_id': {'use_dt': '20171227'}, 'total_rider_pasgr': 7906780},
  {'_id': {'use_dt': '20171228'}, 'total_rider_pasgr': 8056509}]
  ```


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://koonkim.gitbook.io/mongodb/2.0/pymongo/exercises.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
