Wrangle the Google Maps CSV Result

March 19, 2025 (4mo ago)

Should have done this a long time ago.

  1. Receive the CSV File from Google Maps Result
  2. Merge Duplicated Entries. Duplicates occur because neighboring search areas can return overlapping results.
  3. Save only the useful Fields
  4. Compute Percentiles to Find the Most Popular Cafes.
  5. Save the Image Preview with credits.
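
Step 4 could be sketched with pandas roughly like this. The notes above don't say which field measures popularity, so this assumes it is the `reviews` count; the sample rows and the 0.75 cutoff are made up for illustration.

```python
import pandas as pd

# Hypothetical sample rows; real data would come from the cleaned DataFrame
cafes = pd.DataFrame({
    "name": ["Cafe A", "Cafe B", "Cafe C", "Cafe D"],
    "reviews": [120, 45, 900, 300],
})

# Percentile rank of each cafe by review count (0 to 1, higher = more reviews)
cafes["reviews_pct"] = cafes["reviews"].rank(pct=True)

# Assumed cutoff: keep cafes at or above the 75th percentile
popular = cafes[cafes["reviews_pct"] >= 0.75]
print(popular["name"].tolist())  # ['Cafe C', 'Cafe D']
```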

<mark>PENDING: JSON COMBINATION CHECK, SO THAT IMAGES ARE NOT RETRIEVED TWICE</mark>
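
One possible shape for that pending check, as a rough sketch: treat `place_id` as the unique key and skip any record whose `place_id` already appears in a previously saved JSON file. The helper name `new_records` is hypothetical, and the saved file is assumed to be a JSON array of records like the one this post writes.

```python
import json
from pathlib import Path


def new_records(records, existing_path):
    """Return only the records whose place_id is not already saved in existing_path."""
    path = Path(existing_path)
    seen = set()
    if path.exists():
        # Collect the place_ids that were saved by an earlier run
        with path.open(encoding="utf-8") as f:
            seen = {row["place_id"] for row in json.load(f)}
    # Keep only records that have not been seen before
    return [row for row in records if row["place_id"] not in seen]
```

Images would then be fetched only for the rows returned by something like `new_records(df.to_dict("records"), f"{DATA_NAME}.json")`, called before the JSON file is overwritten.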

Combine and Clean Up CSV Files

import pandas as pd
import json
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np


DATA_NAME = "tokyo_animal_cafe"

# Read each listed CSV into a DataFrame (add more file names to the list to combine them)
dfs = [pd.read_csv(file) for file in ['tokyo_animal_cafe_01.csv']]

# Concatenate and drop duplicates based on "title"
combined_df = pd.concat(dfs, ignore_index=True).drop_duplicates(subset="title", keep="first")

# Keep only the useful fields; .copy() gives an independent frame that is safe to modify
df = combined_df[["title", "address", "gps_coordinates", "reviews", "types", "type", "website", "place_id", "rating"]].copy()
df = df.rename(columns={"title": "name", "gps_coordinates": "coordinates"})

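# read_csv loads the coordinates column as text, so parse each value into a dict first.
# Assumption: the text is a dict literal such as {'latitude': 35.6, 'longitude': 139.7}.
import ast
df["coordinates"] = df["coordinates"].apply(lambda x: ast.literal_eval(x) if isinstance(x, str) else x)
df["coordinates"] = df["coordinates"].apply(lambda x: [x["latitude"], x["longitude"]])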
df.to_json(f"{DATA_NAME}.json", orient='records', lines=False, indent=4)
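
If the scrape produces several files, the hard-coded list could be replaced with a glob. This is only a sketch under two assumptions not stated above: the files share a naming pattern such as `tokyo_animal_cafe_*.csv`, and `place_id` is a safer duplicate key than `title`, since two different cafes can share a name. `load_and_merge` is a hypothetical helper.

```python
from pathlib import Path

import pandas as pd


def load_and_merge(directory, pattern):
    """Read every CSV matching pattern in directory and drop repeated place_id rows."""
    files = sorted(Path(directory).glob(pattern))
    if not files:
        raise FileNotFoundError(f"no files match {pattern!r} in {directory!r}")
    frames = [pd.read_csv(file) for file in files]
    merged = pd.concat(frames, ignore_index=True)
    # Keep the first occurrence of each place and renumber the rows
    return merged.drop_duplicates(subset="place_id", keep="first").reset_index(drop=True)
```

A call such as `load_and_merge(".", "tokyo_animal_cafe_*.csv")` would stand in for the list comprehension and the `pd.concat` line.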

Retrieve Images