
From Receipts To Riches: Save Money W/ Google Cloud & Supermarket Bills – Part 1

  • aster.cloud
  • May 8, 2023

In today’s world, every penny counts, and saving money on supermarket spending is no exception. Have you ever wondered: 

  • How many packets of quinoa did you buy this year? 
  • How much have egg prices gone up? 
  • Why are grocery bills high in certain months? 
  • What’s the most expensive item you’ve ever bought at the supermarket?

This is the first post in a two-part series that shows how simple it is to combine managed Google Cloud services into a complete application that digitizes your grocery receipts and analyzes your spending patterns. The same architecture applies to any use case where you want to bring the power of Document AI to a fully event-driven processing pipeline.

In this first part, we discuss how to use Document AI to extract important information from your supermarket receipts, such as the date, store name, and items purchased. Storing this information in Datastore lets you access it quickly and easily, and BigQuery lets you analyze it to gain insights into your spending habits. This can help you identify areas where you can save money, such as by purchasing generic brands instead of name brands or buying in bulk. With this combination of Google Cloud technologies, you can turn your receipts into a valuable data source that helps you save money and make smarter purchasing decisions.

Don’t let your grocery receipts go to waste – turn them into riches with Google Cloud!

Architecture:

https://storage.googleapis.com/gweb-cloudblog-publish/images/1._Part1-ArchitectureDiagram.max-1200x1200.jpg

The following services will be used in this architecture:

  • Document AI: Document AI is a cloud-based AI tool that extracts structured data from unstructured documents, automates tasks, and saves businesses time and money. 
  • Cloud Functions: Google Cloud Functions is a serverless compute platform for scalable, event-driven applications.
  • BigQuery: BigQuery is a fast, easy-to-use, powerful serverless data warehouse for analyzing large amounts of data.
  • Cloud Datastore: Cloud Datastore is a fully managed, scalable, high-performance, durable, and secure NoSQL database for web, mobile, IoT, and big data applications.
  • Google Cloud Storage: Google Cloud Storage is a scalable, durable, and available object storage service for your data. It’s powerful, flexible, and cost-effective.
  • Cloud Logging: Cloud Logging is a fully-managed log management service. It collects, stores, and analyzes log data from GCP and on-premises.

Here are the steps to build this service: 

1. If you haven’t already, set up a Google Cloud account using the Starter Checklist.

2. Set up Document AI Custom Document Extractor: 

a. This guide describes how to use Document AI Workbench to create and train a Custom Document Extractor that processes any document. 

b. Follow that guide to create a custom processor for receipts from the supermarket you visit most often (Document AI custom processors support a wide range of languages). In our experience, it took about 15 receipts to train and test each supermarket receipt model.

c. Processor schema should be as shown in the screenshot below. This schema covers all the information required on any supermarket bill, and most supermarket bills have these labels. Please note that if you change this schema, the code provided in step 4 below for Cloud Functions and BigQuery tables schema must be updated accordingly.

https://storage.googleapis.com/gweb-cloudblog-publish/images/2._DocAI_Schema.max-1500x1500.jpg

d. This Custom Document Extractor processor will be called within Cloud Functions to parse and identify any bill uploaded by the users via a web application or a phone application. 

e. You can train on any invoice format as long as the schema in step 2 (c) above stays the same. The invoice should include an invoice ID, the number of items sold, the purchase date, each item with its cost, the sales tax, and the subtotal. Here is a sample invoice used in this application that follows this schema:

https://storage.googleapis.com/gweb-cloudblog-publish/images/3._InvoiceExample.max-400x400.jpg
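The fields listed in step 2 (e) can be pictured as a small record type. The sketch below is written in Python purely for illustration (the Cloud Function in this post is written in C#). The class and attribute names are assumptions modeled on the BigQuery column names used later in this post, not the labels of the actual processor schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class Item:
    """One purchased line item (hypothetical names mirroring the Items table)."""
    item_code: str
    item_cost: float


@dataclass
class Invoice:
    """One receipt with the fields the schema is expected to capture."""
    invoice_id: str
    purchase_date: datetime
    sales_tax: float
    subtotal: float
    items: List[Item] = field(default_factory=list)

    @property
    def items_sold(self) -> int:
        # Number of line items on the receipt.
        return len(self.items)

    def items_total(self) -> float:
        # Sum of the line item costs, useful as a sanity check against subtotal.
        return sum(item.item_cost for item in self.items)
```

Comparing `items_total()` with `subtotal` is one simple way to spot a receipt where the extraction missed a line item.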

3. Set up Google Cloud Storage: Follow this documentation and create storage buckets to store uploaded bills. 

4. Set up Google Cloud Function: 

a. Follow this documentation to create a second-generation Cloud Function with an Eventarc trigger that fires when a new object is added to the Cloud Storage bucket created in step 3. Note the following while creating the function and adding the trigger:

i. When creating the function, make sure that “Allow internal traffic and traffic from Cloud Load Balancing” is selected in the network settings.

ii. When adding an Eventarc trigger, choose the “google.cloud.storage.object.v1.finalized” event type.

iii. We recommend creating a new service account for the Eventarc trigger instead of using the “Default Compute Service Account”. Grant the new service account the “Cloud Run Invoker” and “Eventarc Event Receiver” roles.

iv. Likewise, do not use the “Default Compute Service Account” as the function’s “Runtime service account”. Create a new one and grant it the following roles so that it can connect to Datastore, BigQuery, Logging, and Document AI: BigQuery Data Owner, Cloud Datastore User, Document AI API User, Logs Writer, Storage Object Creator, and Storage Object Viewer.

v. Once done, service accounts and permissions should appear as follows:

https://storage.googleapis.com/gweb-cloudblog-publish/images/4._IAM_permissions.max-1000x1000.jpg
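To make the trigger in step 4 (a) more concrete, here is a minimal Python sketch of what a handler might do with the event it receives. It assumes the event payload carries the `bucket` and `name` fields of the uploaded object, as Cloud Storage object events do. The function names are hypothetical, and the post's actual function is written in C#.

```python
from typing import Tuple

# Event type chosen when adding the Eventarc trigger.
FINALIZED = "google.cloud.storage.object.v1.finalized"


def object_from_event(event_type: str, data: dict) -> Tuple[str, str]:
    """Return (bucket, object name) for a finalized-object event."""
    if event_type != FINALIZED:
        raise ValueError(f"unexpected event type: {event_type}")
    return data["bucket"], data["name"]


def gcs_uri(bucket: str, name: str) -> str:
    """Build the gs:// URI that identifies the uploaded bill."""
    return f"gs://{bucket}/{name}"
```

Rejecting other event types early keeps the function from processing deletions or metadata updates if the trigger is ever reconfigured.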

b. Next, open the Cloud Function you created and select .NET 6.0 as the runtime. This will allow you to write C# functions.

c. Replace the existing code in the Cloud Function with this code, updating the contents of both the HelloGcs.csproj and Function.cs files. For the build to succeed, replace the variables in the code with the names of your own Google Cloud resources.

d. Deploy the Cloud Function.

e. This code creates two types (Kind) of entities, “Invoices” and “Items”, in Datastore. Please refer to this documentation for more information on Datastore.

f. This code also creates a new LogName under global scope. Please refer to this documentation for more information on creating custom log entries. 
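The split into “Invoices” and “Items” described in step 4 (e) can be sketched as follows. This is an illustrative Python sketch, not the linked C# code. It assumes the processor output is available as a list of entity dictionaries with `type`, `mentionText`, and (for line items) nested `properties`, as in Document AI's JSON output. The entity type names are hypothetical and borrowed from the BigQuery column names in step 5.

```python
from typing import Dict, List, Tuple


def split_entities(entities: List[dict]) -> Tuple[Dict[str, str], List[Dict[str, str]]]:
    """Separate invoice-level fields from line items.

    Entities with nested properties are treated as line items (an assumption);
    all other entities are treated as fields of the invoice itself.
    """
    invoice: Dict[str, str] = {}
    items: List[Dict[str, str]] = []
    for entity in entities:
        properties = entity.get("properties") or []
        if properties:
            items.append({p["type"]: p["mentionText"] for p in properties})
        else:
            invoice[entity["type"]] = entity["mentionText"]
    # Copy the invoice ID onto every item so the two kinds can be joined later.
    for item in items:
        item["InvoiceID"] = invoice.get("InvoiceID", "")
    return invoice, items
```

Carrying `InvoiceID` on each item mirrors the “Items” table schema in step 5, which also has an `InvoiceID` column.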

5. Set up BigQuery:

a. To create a new BigQuery dataset, please follow these instructions.

b. To create a new table called “Invoices” in the dataset created in step (a) above, follow the instructions in this documentation. In the Schema section, use the following JSON as the “schema definition” for the “Invoices” table:

[
  {
    "name": "Items_sold_per_invoice",
    "mode": "NULLABLE",
    "type": "INTEGER",
    "description": null,
    "fields": []
  },
  {
    "name": "Purchase_Date",
    "mode": "NULLABLE",
    "type": "DATETIME",
    "description": null,
    "fields": []
  },
  {
    "name": "Sales_Tax",
    "mode": "NULLABLE",
    "type": "FLOAT",
    "description": null,
    "fields": []
  },
  {
    "name": "SubTotal",
    "mode": "NULLABLE",
    "type": "FLOAT",
    "description": null,
    "fields": []
  },
  {
    "name": "InvoiceID",
    "mode": "NULLABLE",
    "type": "BIGNUMERIC",
    "description": null,
    "fields": []
  }
]

When finished, the “Invoices” table and its schema should resemble the following:

https://storage.googleapis.com/gweb-cloudblog-publish/images/5._BQ_InvoicesTable.max-1700x1700.jpg

c. Repeat the previous step to create a second table called “Items” in the same dataset, following the instructions in this documentation. In the Schema section, use the following JSON as the “schema definition” for the “Items” table:

[
  {
    "name": "Item_code",
    "mode": "NULLABLE",
    "type": "STRING",
    "description": null,
    "fields": []
  },
  {
    "name": "Item_cost",
    "mode": "NULLABLE",
    "type": "FLOAT",
    "description": null,
    "fields": []
  },
  {
    "name": "InvoiceID",
    "mode": "NULLABLE",
    "type": "STRING",
    "description": null,
    "fields": []
  }
]

When finished, the “Items” table and its schema should resemble the following:

https://storage.googleapis.com/gweb-cloudblog-publish/images/6._BQ_ItemsTable.max-1500x1500.jpg
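Because Document AI returns extracted values as text, each value has to be converted to the column type before it is written to BigQuery. The Python sketch below shows one way to do that for the “Invoices” schema from step 5 (b). The column names and types come from that schema. The converter mapping and the `coerce_row` helper are assumptions for illustration, and real receipts may need extra cleanup (currency symbols, local date formats) that is not handled here.

```python
from datetime import datetime
from decimal import Decimal
from typing import Any, Callable, Dict, Optional

# Column names and types copied from the "Invoices" schema definition.
INVOICES_SCHEMA = [
    {"name": "Items_sold_per_invoice", "type": "INTEGER"},
    {"name": "Purchase_Date", "type": "DATETIME"},
    {"name": "Sales_Tax", "type": "FLOAT"},
    {"name": "SubTotal", "type": "FLOAT"},
    {"name": "InvoiceID", "type": "BIGNUMERIC"},
]

# Assumed mapping from BigQuery type names to Python converters.
CONVERTERS: Dict[str, Callable[[str], Any]] = {
    "INTEGER": int,
    "FLOAT": float,
    "DATETIME": datetime.fromisoformat,
    "BIGNUMERIC": Decimal,
    "STRING": str,
}


def coerce_row(raw: Dict[str, Optional[str]]) -> Dict[str, Any]:
    """Convert extracted text values to the types the table expects.

    Missing or empty values become None, which matches the NULLABLE mode
    of every column. Text that cannot be converted raises an exception.
    """
    row: Dict[str, Any] = {}
    for column in INVOICES_SCHEMA:
        text = raw.get(column["name"])
        if text is None or text.strip() == "":
            row[column["name"]] = None
        else:
            row[column["name"]] = CONVERTERS[column["type"]](text.strip())
    return row
```

Keys that are not part of the schema are dropped, so a row built this way never contains a column the table does not have.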

6. Testing:

a. To test the application, upload a supermarket bill to the Cloud Storage bucket that you created in step 3 and selected under “Receive events from” in the Eventarc trigger in step 4.

b. Uploading a supermarket bill to the bucket triggers the Cloud Function, which reads the file, processes it with the Document AI custom processor, extracts the relevant fields, and stores the data in three destinations: the BigQuery tables created in step 5 above, Datastore (as entities of the kinds “Invoices” and “Items”), and a new text file per uploaded document that contains all extracted entities for archiving purposes.
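The flow in step 6 (b) is a fan-out: one extraction result is written to several destinations. A minimal Python sketch of that shape follows, with the extractor and the destinations passed in as plain functions so it runs without any cloud services. All names here are hypothetical, and collecting failures instead of stopping at the first one is just one possible design, not necessarily what the linked C# code does.

```python
from typing import Callable, Dict, List

Extractor = Callable[[str], dict]   # object name -> extracted record
Sink = Callable[[str, dict], None]  # (object name, record) -> None


def process_upload(object_name: str, extract: Extractor, sinks: Dict[str, Sink]) -> List[str]:
    """Extract one uploaded bill and write the result to every sink.

    Returns the names of the sinks that failed, so the caller can log
    or retry them without losing the writes that succeeded.
    """
    record = extract(object_name)
    failed: List[str] = []
    for sink_name, write in sinks.items():
        try:
            write(object_name, record)
        except Exception:
            failed.append(sink_name)
    return failed
```

In a real function the three sinks would wrap the BigQuery, Datastore, and Cloud Storage clients, and the returned list would be a natural thing to send to Cloud Logging.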

There are many ways to explore data in BigQuery; one of them is to connect Looker to your BigQuery tables. Here are some sample reports you can build with this data:

https://storage.googleapis.com/gweb-cloudblog-publish/images/7._InvoicesReport.max-1900x1900.jpg
https://storage.googleapis.com/gweb-cloudblog-publish/images/8._ItemsReport.max-1700x1700.jpg
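The questions from the introduction, such as why bills are high in certain months or which item was the most expensive, come down to simple aggregations over the two tables. The Python sketch below uses an in-memory SQLite database as a stand-in for BigQuery so that it runs anywhere. The table and column names follow the schemas in step 5, the sample rows are invented for the demo, the SQL dialect differs from BigQuery's, and treating the bill total as `SubTotal + Sales_Tax` is an assumption.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create in-memory tables shaped like the Invoices and Items schemas."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Invoices (InvoiceID TEXT, Purchase_Date TEXT, "
        "SubTotal REAL, Sales_Tax REAL, Items_sold_per_invoice INTEGER)"
    )
    conn.execute("CREATE TABLE Items (InvoiceID TEXT, Item_code TEXT, Item_cost REAL)")
    # Invented sample data, for illustration only.
    conn.executemany(
        "INSERT INTO Invoices VALUES (?, ?, ?, ?, ?)",
        [
            ("1001", "2023-05-01 10:00:00", 10.0, 0.5, 2),
            ("1002", "2023-05-20 18:30:00", 20.0, 1.0, 1),
            ("1003", "2023-06-02 09:15:00", 5.0, 0.25, 1),
        ],
    )
    conn.executemany(
        "INSERT INTO Items VALUES (?, ?, ?)",
        [
            ("1001", "EGGS", 4.0),
            ("1001", "QUINOA", 6.0),
            ("1002", "OLIVE_OIL", 20.0),
            ("1003", "MILK", 5.0),
        ],
    )
    return conn


def monthly_spend(conn: sqlite3.Connection) -> list:
    """Total spend per calendar month, oldest month first."""
    return conn.execute(
        "SELECT strftime('%Y-%m', Purchase_Date) AS month, "
        "SUM(SubTotal + Sales_Tax) AS total "
        "FROM Invoices GROUP BY month ORDER BY month"
    ).fetchall()


def priciest_item(conn: sqlite3.Connection) -> tuple:
    """The single most expensive line item across all invoices."""
    return conn.execute(
        "SELECT Item_code, Item_cost FROM Items ORDER BY Item_cost DESC LIMIT 1"
    ).fetchone()
```

The same two queries translate to BigQuery with only small changes, mainly replacing `strftime` with BigQuery's own date formatting functions.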

Congratulations on digitizing your supermarket bills! You are one step closer to riches.

In the second part of this series, we will discuss how this architecture can be extended so that a bill from any supermarket can be digitized. Stay tuned!

In the meantime, please review Google Cloud’s tutorials to learn more about the platform and how to use it effectively.

By Krishna Chytanya Ayyagari Senior Customer Engineer, Infrastructure
Originally published at Google Cloud

Source: Cyberpogo