Akash Milton

Tech for Non-tech folks


In 2024, you might wonder why anyone should bother reading this when GPTs are so advanced. You can ask them any question, and they’ll respond with tailored, well-crafted answers. That’s an undeniable truth. While GPTs are exceptional tools for getting you unstuck, true ideation—the ability to spark meaningful ideas—comes only from prior knowledge and understanding. And as of today, the only way to gain that understanding is to engage with the material yourself.

This article is written for people who aren't developers but work around technology. You might be a product manager, support agent, marketer, salesperson, or pre-sales professional at a tech company—what I refer to as "non-tech" roles. This term isn’t meant to be derogatory but simply a way to describe those not directly involved in development.

By narrowing the focus, I’ve stripped away unnecessary details to give you surface-level knowledge that’s practical and actionable. The goal is to help you better understand the possibilities and limitations of technology, specifically Information Technology with a focus on the web. This understanding can make your job easier and your communication with technical colleagues more frictionless.

While the title may seem broader due to the word “tech” (because it flows better), the content is deliberate and focused. The article is long for a reason—feel free to skip or skim where needed.

The Spark

It’s a fine Saturday morning. You have no plans, and all those years of optimizations have finally paid off—you suddenly have time to kill. Life’s good, but there’s an itch, a feeling that you want to do something.

You think back to yesterday, to how hectic it was when you needed a printout for a government office. The whole process flashes before you: you walked to the nearby printer/xerox shop, shared your document via email or WhatsApp, and watched as the shopkeeper kept refreshing his inbox. Once he finally downloaded and printed it, you hesitated, awkwardly asking him to delete the file—from his computer, from his email, and even from his trash.

It hits you. This is exactly how it was 10 years ago. Nothing has changed. No improvement. No progress.

What you see isn’t just a pain point; it’s an opportunity. Why hasn’t anyone fixed this? You don’t print often enough to justify owning a printer, but the inconvenience still nags at you. It’s all aligning in your head—the problem, the need, the possibility. You’re being touched by something others seem to overlook, and suddenly it feels like your very own Newton’s apple moment.

Excitement builds. The idea takes root. You can see the solution, the impact. But wait—there’s just one small problem.

You have no idea how to do any of this.

And yet, you’re here. At the start of something.

'I told you so'

You meet your engineering friends, buzzing with excitement, and pitch your idea. You expect them to light up, to see the potential. Instead, they shake their heads and dismiss it outright.

“Why fix printing?” they say. “Printing itself is outdated. If we solve the receiver’s side, there wouldn’t be a need to print at all.”

Classic developer mentality—always thinking like we live in the United States of Utopia, skipping straight to the ideal world rather than taking small, meaningful steps that actually move mankind forward.

But you’re not here to change the world. You’re here to slide in, to seize this small opportunity no one else seems to care about. You don’t need to eliminate printing; you just need to make it work better. You can see it clearly—a Swiggy of printing.

This little setback isn’t going to slow you down. If anything, it fires you up. For now, you decide to roll up your sleeves and build the minimum viable product yourself. The vision is clear: once this grows into the Swiggy of printing, you’ll deliver a printed “See? I told you so” straight to their doorsteps just like Mark Zuckerberg's "I'm CEO, Bitch" card.

Landing Page

Building a product takes time—a long time. To make matters trickier, you’re not even sure exactly what you want to build. But you know one thing: you can start small. A basic landing page is enough for now. It’s a first step to attract vendors and visitors, have meaningful conversations, and shape the product based on real feedback. And hey, who knows? Maybe an investor might stumble across it. Luck may or may not knock on your door, but the more doors you build, the better your chances.

You could use platforms like Wix, Squarespace, or WordPress to whip up a website quickly—those tools are easy and efficient. But since you’re committed to learning how to build an app yourself (and for the sake of this blog), you decide to build it from scratch.

Let’s step back for a moment. At its core, a computer does three things: it stores data, retrieves data, and processes data to generate new data. Any device that performs these tasks—a phone, a watch, a calculator—is essentially a computer.

Now, what we call the Internet is simply the connectivity between these computers, allowing them to share and communicate data.

So here’s the plan: first, let’s get a webpage up and running on your computer. Once that’s ready, we’ll figure out how to make it accessible to other computers around the world. Step by step, we’ll bring this idea to life.

To create your webpage, you need to write its content using HTML (HyperText Markup Language). Unlike programming languages, where you provide a sequence of instructions to be executed, markup languages are used to structure data or content in a hierarchical way. This structure is then interpreted by other programs, like web browsers, to display it.

These hierarchies aren’t just for humans; they also help search engine crawlers understand your webpage. The better your structure, the easier it is for search engines to index your content and display it in search results.

Go to Google Forms and find the embed code option. Copy the code (which is in HTML format). Paste it directly into your webpage’s HTML file.

This allows your form to appear seamlessly within your webpage.

You’ve probably seen the prefix http in many website URLs. It stands for HyperText Transfer Protocol. This protocol defines how hypertext (like HTML files) is transferred over the Internet from one place to another.

When websites communicate using this protocol, it’s often called a network call or HTTP call.

  • http: The basic version of the protocol.
  • https: A secure version where the communication between your browser and the server is fully encrypted. This ensures no intermediary (like a hacker or a malicious program) can read or tamper with the data.

If you visit a site without https, your browser may warn you because the connection isn’t secure. In today’s web, https is the standard for any professional or trusted site.

When you first load your webpage, you’ll notice that the browser applies some basic styling to the HTML content by default. While it works, it looks plain and unpolished. To make the page visually appealing with colors, fonts, and spacing, you need to add a stylesheet using CSS (Cascading Style Sheets).

CSS allows you to control how HTML elements look—like their color, size, position, and layout—making your webpage much more attractive and user-friendly.

Now, there’s one small requirement: users should only see the Google Form when they click on a “Contact Us” button or link on the same page. This is where JavaScript comes in.

JavaScript is the programming language that browsers use to make webpages interactive. With it, you can show or hide elements, handle user actions like clicks, and even add animations.

With HTML for content, CSS for styling, and JavaScript for interactivity, your page is now complete. You’ve written everything from scratch, and it’s running perfectly on your computer.

The next step? Making it accessible to the rest of the world.

By tinkering with some configuration, you can expose your webpage to the world. For example, if your computer’s IP address is 203.185.33.25 and you’re exposing port 3000, anyone typing
http://203.185.33.25:3000
into their browser can access your webpage. Essentially, you’re making your computer act as a server for the site.

What’s a server?
A server isn’t someone—it’s just a name for the role a computer plays in this setup: serving the website’s content to those who request it.
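To make "acting as a server" a little more concrete, here is a minimal sketch using Python's built-in http.server module. It is one possible way to serve a folder of files, not the only one. The helper name make_server and the folder name "site" are assumptions for illustration; port 3000 matches the example above.

```python
from functools import partial
from http.server import HTTPServer, SimpleHTTPRequestHandler


def make_server(directory: str, port: int = 3000) -> HTTPServer:
    """Create (but do not start) a server that serves files from `directory`."""
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    # "" means: listen on every network interface of this computer.
    return HTTPServer(("", port), handler)
```

Calling make_server("site").serve_forever() would keep answering requests until you stop the program, which is exactly the reliability problem described next: the site is only up while your computer is.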

Problems with Self-Hosting

While this works in theory, it comes with a few problems:

  1. Reliability: If your computer shuts down or your internet connection goes off, people won’t be able to reach your webpage.
  2. Performance: Your computer isn’t powerful enough to handle a lot of visitors (high traffic).
  3. Static IP: Most of our internet connections use dynamic IP addresses that change frequently. To host your site properly, you’d need a static IP from your Internet Service Provider (ISP).

The Solution: Hosting Services

Instead of hosting the webpage yourself, you can ask someone with all the resources to do it for you. This is called hosting the site. Hosting providers take care of:

  • Reliability (their servers are always on)
  • Scalability (they can handle high traffic)
  • Proper networking (no need for a static IP).

For simple use cases like yours, providers like Netlify or Vercel are great options.

Using a Content Delivery Network (CDN)

When you host your site with platforms like Netlify or Vercel, they don’t just keep your site in one physical location. Instead, they use a Content Delivery Network (CDN).

A CDN keeps copies of your website in multiple geographical locations. This improves speed and reduces network load.

For example:

  • Imagine Amazon’s main server is in Mumbai, but they have CDN servers in cities like Chennai.
  • If someone from Chennai wants to access a file, they don’t have to fetch it all the way from Mumbai. The file is served directly from Chennai, making it faster.

This approach keeps websites fast and accessible while reducing traffic on the main server.

Your Site is Live!

For now, you’ve uploaded your source files through the hosting provider’s UI, and your webpage is live, accessible from anywhere in the world. You’ve taken your first steps toward building a real, working product.

"What's in a (domain) name?"

Now that hosting is done, you can share the host's IP address and port number with anyone. But let’s be honest: doesn’t that look weird? Long strings of numbers and ports are hard to remember, and to most people, they might even look like spam links.

To fix this, it’s a general practice to mask the address with a name. This name is called a domain name. A domain name acts as a one-way pointer to your host's address (IP). There’s no limit to how many domain names can point to the same IP address, and you don’t need any approval from the host to do this.

An organization called ICANN coordinates domain names globally. You can lease one through your favorite domain registrar (e.g., GoDaddy, Namecheap) and even transfer it between registrars if needed, similar to how phone numbers belong to a common pool and can move between telecom networks.

The last part of a domain name, like .com, .in, or .app, is called the Top-Level Domain (TLD). These are just namespaces and don’t carry much technical significance. You registered the domain paperpaste.app, which means you now own both the main domain (paperpaste.app) and all its subdomains (example.paperpaste.app, example2.example.paperpaste.app, and so on).

In the earlier days of the web, people often created a subdomain like www (e.g., www.example.com) and redirected the plain domain to it. This is largely unnecessary now, as the main domain itself works fine.

Now you need to point this domain to your hosting provider’s IP address. This connection is handled by the Domain Name System (DNS). DNS acts as the bridge between the domain name and the server IP. When someone types your domain name into their browser, DNS ensures they’re taken to the correct server.

Luckily, basic DNS management comes free with most domain registrars, so you don’t need to worry about setting up a separate system. DNS not only helps point your domain and subdomains to specific IPs but also lets you configure where emails sent to your domain (e.g., contact@paperpaste.app) should go. It can even verify your ownership of the domain when required, such as for hosting providers or analytics tools.

All that’s left now is to configure the DNS with your hosting provider’s IP. Once done, anyone typing your domain name into their browser will be taken to your website.
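Conceptually, DNS behaves like a lookup table from names to addresses. The sketch below is a toy model only: real DNS is a distributed system with several record types, and the IP address here is just the made-up example from earlier. The names DNS_RECORDS and resolve are hypothetical.

```python
# A toy stand-in for DNS: a lookup table from names to IP addresses.
# The IP below is the hypothetical example address used earlier.
DNS_RECORDS = {
    "paperpaste.app": "203.185.33.25",
    "www.paperpaste.app": "203.185.33.25",  # several names may share one IP
}


def resolve(domain: str) -> str:
    """Return the IP address a domain points to, or raise if unknown."""
    try:
        return DNS_RECORDS[domain.lower()]
    except KeyError:
        raise LookupError(f"no DNS record for {domain}") from None
```

Configuring DNS at your registrar amounts to adding a row to a table like this one, so that the browser's lookup for your name returns your host's address.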

Congrats, you’re now running a basic website that’s accessible to the world!

Mail

To communicate professionally, using a custom email address linked to your domain is a great idea. It enhances authenticity and credibility. Similar to hosting and domain configuration for websites, custom email setup involves specific configurations as well. Here's how to get started with an email service:

  • To send emails:

    1. Add the service's provided token to your DNS. This allows the email receiver to verify that the service has authority over your domain.
    2. Include a DKIM (DomainKeys Identified Mail) record, which acts as a digital signature for your emails.
  • To receive emails:
    Add the email service's MX (Mail Exchange) record to your DNS. This ensures that incoming emails are routed to your chosen email service's inbox.

With this setup, you can now use a professional email address, like sales@paperpaste.app.

SEO

While you can share your link with people and groups to generate leads, the most effective way is to attract users who are actively searching for related topics. This works like free advertising.

If your site gets mentioned on another website, Google's crawler will eventually discover it. However, since your site is new, you'll need to submit it manually. By doing so, you also gain access to search analytics.

To submit your site, you must prove domain ownership by adding specific records and a provided verification text to your DNS. Once these changes are reflected in your DNS, the system will recognize that the domain belongs to you.

Analytics

Without analytics, you wouldn't know if anyone is visiting your website. To address this, you can set up basic analytics using tools like Google Analytics or similar platforms. Typically, these services provide a code snippet to embed in your HTML.

The snippet, though obfuscated, is simple in principle:

  1. It instructs the browser to download a JavaScript file from a URL.
  2. The browser executes the downloaded code.

This marks a significant shift for your website. Previously, your site was static and read-only. By adding this snippet, your website now interacts with external systems and logs visitor data, representing a technical leap.

Humans communicate through language, while humans and applications primarily interact via graphical user interfaces (GUIs). Similarly, applications communicate with one another using APIs (Application Programming Interfaces).

Analytics tools expose APIs to let websites report visitor and event data. APIs are conceptual frameworks and can be implemented in various ways, with REST APIs being one of the most popular. REST APIs use HTTP calls in a specific format to transfer information.

For instance, an API request to log a visitor might look like this:
POST analytics.google.com/pm/collect
This request would include visitor details and other relevant data.

The code snippet you embed might look different because the service packages all the necessary API calls and logic into a JavaScript file. This pre-written code simplifies the implementation by acting as a wrapper for the API. Such pre-built modules are called SDKs (Software Development Kits).

  • APIs are language-agnostic, meaning they work across different programming languages.
  • SDKs, however, are built for specific programming languages and provide ready-to-use tools for integrating with APIs.

While there are pros and cons to using SDKs, they are designed to simplify the process, enabling websites and apps to easily interact with external services like analytics platforms.
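The difference between an API and an SDK can be sketched in a few lines. The endpoint URL and field names below are invented for illustration and do not describe any real analytics service. The function builds the raw HTTP call without sending it, and the small class plays the role of an SDK wrapping that call.

```python
import json
from urllib.request import Request

# Hypothetical endpoint and field names, for illustration only.
COLLECT_URL = "https://analytics.example.com/collect"


def build_visit_request(page: str, visitor_id: str) -> Request:
    """Build (without sending) the raw HTTP call that reports a page visit."""
    body = json.dumps({"event": "page_view", "page": page, "visitor": visitor_id})
    return Request(
        COLLECT_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


class AnalyticsSDK:
    """A tiny 'SDK': it hides the API details behind one friendly method."""

    def __init__(self, visitor_id: str) -> None:
        self.visitor_id = visitor_id

    def track_page(self, page: str) -> Request:
        return build_visit_request(page, self.visitor_id)
```

Someone using the SDK only writes AnalyticsSDK("visitor-1").track_page("/home") and never has to know the URL, the HTTP method, or the JSON layout.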

Actual App

In our focus on marketing, we overlooked the need to build the actual product. Now is the perfect time to start. First we need to decide how the app will be distributed, since that choice shapes our revenue model. Here are the two main approaches we can consider:

1. On-Premises Software

In this model, we sell the software as a product that customers buy, install, and host themselves. It enables them to set up their own site where others can place orders.

  • Revenue Model: Customers pay a significant one-time fee upfront. We might also charge for updates or ongoing support.
  • Expenses: Limited primarily to development costs, as hosting and maintenance are handled by the customer.
  • Characteristics:
    • Similar to selling a physical product.
    • Less common today but helps understand its opposite: SaaS.

2. Software as a Service (SaaS)

Here, we host and maintain the software ourselves. Customers can sign up and use the platform without needing to handle installations, updates, or maintenance.

  • Revenue Model: Subscription-based, allowing us to charge recurring fees.
  • Expenses: We bear hosting, storage, and maintenance costs.
  • Characteristics:
    • Simplifies the user experience for both parties.
    • Ideal for building a marketplace, as we retain control over the platform and ensure seamless service.

Since we aim to build a marketplace, the SaaS model is the better fit. Hosting the platform ourselves enables greater flexibility, consistent updates, and the ability to scale efficiently.

Unlike our basic website, our app involves significant hosting requirements:

  • Storage: To save user files and data.
  • Data Transfer: To handle uploads, downloads, and interactions.
  • Computing: To process orders, run algorithms, and manage workflows.

It’s crucial to factor in these hosting costs as they will significantly impact our operational expenses. Proper planning ensures we remain cost-effective while providing a seamless user experience.

Cloud Pricing

Just like office spaces can be rented at various levels—renting an entire floor, individual rooms, or even paying per seat—cloud services come in different levels as well. The most popular ones include:

  • IaaS (Infrastructure as a Service):
    You get a virtual full computer to use however you want. This is the most flexible and widely used model.

  • BaaS (Backend as a Service):
    A managed backend service that provides pre-built APIs and functionalities like user authentication, databases, and more.

  • FaaS (Function as a Service):
    You deploy small, specific functions that are triggered by events, with the cloud provider managing the infrastructure.

  • PaaS (Platform as a Service):
    Provides a platform to develop and deploy applications without managing the underlying hardware or operating systems.

The difference between these layers lies in the balance of freedom, price, and responsibility. More control (like in IaaS) means more freedom but also more responsibility and cost.

Cloud Pricing Basics

Cloud pricing is determined by three key factors, listed here from most to least expensive:

  1. Computing: The cost of processing power, which is typically the most expensive.
  2. Bandwidth: The cost of transferring data to and from the cloud.
  3. Storage: The cost of saving data in the cloud.

Understanding these factors is essential for planning and managing cloud expenses effectively.

Computing

Computing refers to the processes performed by the system whenever it runs. Every time you fetch data from a database, count records, or respond to an API request, you're utilizing computing resources.

Bandwidth

Just like individuals pay for the internet they use, servers also incur charges for data transfer. For example, if someone uploads or downloads a 1 GB file, the server pays for that bandwidth usage. A rough estimate is ₹7/GB, which is similar to what you might pay for mobile data.

Storage

Storage is relatively inexpensive but comes with tiers based on access frequency:

  • For files that are rarely accessed and can tolerate delays, slower retrieval storage is cheaper.
  • However, remember that transferring files to storage also incurs bandwidth costs, even though storage itself is cheap.

Back of the Envelope

Since we will be deleting files periodically or after the print action is completed, long-term storage costs are not a concern. However, we do need to calculate bandwidth costs carefully, as files will be uploaded, stored, and retrieved frequently.
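A quick way to do this estimate is a few lines of arithmetic. The sketch below uses the rough ₹7/GB figure from above and assumes each file crosses the network twice (one upload by the customer, one download by the print shop). The order volume and file size you plug in are your own guesses, and the function name is hypothetical.

```python
COST_PER_GB_INR = 7.0  # rough figure quoted above


def monthly_bandwidth_cost(orders_per_day: int, avg_file_mb: float,
                           days: int = 30) -> float:
    """Estimate monthly data-transfer cost in rupees.

    Assumes each file crosses the network twice: uploaded once by the
    customer and downloaded once by the print shop.
    """
    transfers_per_file = 2
    total_gb = orders_per_day * days * avg_file_mb * transfers_per_file / 1024
    return total_gb * COST_PER_GB_INR
```

For example, monthly_bandwidth_cost(100, 5) works out to roughly ₹205 a month for 100 orders a day at 5 MB each, which suggests bandwidth is manageable at small scale under these assumptions.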

Now that we have a foundational understanding of infrastructure, it's time to focus on building the app. Essentially, we need to transform our website into a fully functional web app. While we often use "website" and "web app" interchangeably, the difference lies in:

  • Who generates the content: Websites primarily display static or pre-generated content, while web apps are dynamic and rely heavily on user input to generate content.
  • Read-Write Ratio: Websites are read-heavy, whereas web apps involve more write operations due to interactivity and data updates.

Transitioning to a web app involves significantly more effort and complexity, as you'll need to handle numerous processes and interactions to ensure the app works seamlessly.

Good Guys Close In

Two of your developer friends have decided to join you—not because they’re chasing potential wealth or fear missing out, but because they recognize your dedication and want to help. They’ve always wanted to build something meaningful from the ground up rather than tweak minor features of a large, bloated application.

For them, the worst-case scenario is losing some time and money but walking away with valuable memories, learning, and experience. Now it’s time to roll up your sleeves and dive into some real action.

A web app consists of two main parts:

  • Frontend: Handles everything displayed and executed on the client’s machine. For example, the website we built earlier was purely frontend work.
  • Backend: Operates on the cloud server. It manages data storage, user authentication, and executes specific actions. The backend exposes APIs that the frontend uses for communication. If the frontend is like the rocket, the backend is the base station.

When a user places an order through the frontend, it calls a backend API. The backend could write the order details into a text file. Similarly, when a printer's frontend requests the latest orders, the backend reads from this text file and sends the data, completing the information cycle.
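The text-file version of this cycle can be sketched as two small functions. This is an illustration of the idea rather than a recommended design, and the function names and order fields are assumptions.

```python
import json
from pathlib import Path


def place_order(orders_file: Path, customer: str, pages: int) -> None:
    """Append one order to the file as a single line of JSON."""
    order = {"customer": customer, "pages": pages}
    with orders_file.open("a", encoding="utf-8") as f:
        f.write(json.dumps(order) + "\n")


def latest_orders(orders_file: Path) -> list[dict]:
    """Read every order back, oldest first."""
    if not orders_file.exists():
        return []
    with orders_file.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

The customer's frontend would trigger place_order through one API, and the printer's frontend would trigger latest_orders through another.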

However, as the app grows, managing text files becomes challenging—like handling storage, chunking, or modifying data efficiently. This is where databases come in. A database, though ultimately storing data in files or memory, is optimized for speed and can handle millions of records in milliseconds.

Databases are broadly categorized into two types:

  1. SQL (Structured Query Language): Examples include MySQL, PostgreSQL, and MariaDB.
  2. NoSQL (Not Only SQL): Examples include MongoDB, DynamoDB, Firestore, and CouchDB.

Each type has its technical advantages, and the choice depends on specific use cases. Databases can be installed on your cloud server or outsourced, where you pay based on time or usage.
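To get a feel for what a SQL database offers over a text file, here is a small sketch using SQLite, which ships with Python. The table layout and sample rows are made up for illustration, and a production app would more likely use one of the database servers named above.

```python
import sqlite3

# An in-memory SQL database; a real app would use a file or a database server.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, pages INTEGER)")
db.executemany(
    "INSERT INTO orders (customer, pages) VALUES (?, ?)",
    [("asha", 12), ("ravi", 3), ("asha", 40)],
)

# Ask a question in SQL instead of scanning a text file by hand.
total_pages = db.execute(
    "SELECT SUM(pages) FROM orders WHERE customer = ?", ("asha",)
).fetchone()[0]
```

Here total_pages ends up as 52: the database does the filtering and adding, and the application only describes what it wants.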

Version Control

Previously, we zipped the code and uploaded it to the cloud for deployment. While this worked temporarily, it won't be practical in the future as collaboration increases. Managing multiple contributors, reviewing changes, and reverting modifications will become essential. To address these needs, a version control system is necessary.

There are various tools for version control, such as Git, Subversion, Perforce, and Mercurial. Among them, Git is the most popular and widely used.

Git uses algorithms to detect differences between lines of code, making it easy to identify which line in which file has changed. It also highlights conflicts when two versions have overlapping unsynchronized changes. You can manage multiple versions of the codebase and maintain multiple copies across different machines. Typically, there is a central repository where contributors publish their latest changes.
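The idea of a line-by-line difference can be seen with Python's difflib module. Git has its own diff implementation, so this is a stand-in that shows the same kind of output rather than what Git runs internally. The file name and code lines are invented.

```python
import difflib

old_version = ["price = pages * 2", "print(price)"]
new_version = ["price = pages * 3", "print(price)"]

# Lines starting with "-" were removed, lines starting with "+" were added.
diff = list(difflib.unified_diff(
    old_version, new_version,
    fromfile="pricing.py (before)", tofile="pricing.py (after)",
    lineterm="",
))
```

The resulting list contains "-price = pages * 2" and "+price = pages * 3", while the unchanged print line appears with a leading space, so a reviewer can see at a glance that only one line changed.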

Instead of managing the central machine for storing the codebase yourself, you can use services like GitHub or GitLab. Similar to how cloud services simplify server management, these platforms provide managed solutions for centralized codebase storage. They also offer additional features such as code reviews, authentication, and more.

Deployment

With Git set up, you no longer need to manually zip and upload code to reflect changes. Instead, you can configure triggers to automatically fetch the latest code whenever changes are published (referred to as pushing the code). The system can then build the updated version seamlessly.

However, there might be situations requiring manual intervention, such as changing the system architecture or migrating data. These scenarios typically involve more complex adjustments beyond the usual automated workflow.

The Product

While a user could simply email a document, pay via UPI, and have the printer deliver the printed pages through a service like Dunzo, the question arises: why would they need our application? The answer lies in convenience and conformity. Just as Gmail and Google Sheets could theoretically replace many apps but don’t, the usability and streamlined experience of an app are what keep users engaged.

Usability is primarily about UI (User Interface) and UX (User Experience).

  • UI: Focuses on what is presented to the end user. For example, a beautifully designed "Upload" button with an icon and help text saying "Only PDFs and Images allowed" falls under UI. UI determines how things look.
  • UX: Focuses on how the application behaves and interacts with the user. For instance, if the file picker only shows files that meet the allowed criteria when the user clicks the upload button, this behavior belongs to UX. UX determines how things work.

Together, great UI and UX enhance the app's ease of use and user satisfaction.

Security is another critical aspect. Two important concepts here are:

  1. Authentication: Verifying that the user is who they claim to be.
  2. Authorization: Determining what rights or permissions the user has. For example, while every user can read a blog post, only the author can edit or delete it.

Combining excellent usability with strong security ensures the product stands out and provides genuine value to its users.

Authentication

Authentication begins when the user inputs their username and password. Our servers compare these credentials with the database records, and if they match, the server generates a token to identify the session. This token can be used in subsequent communications to verify the user’s identity without requiring them to log in again.
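A minimal sketch of this login flow is shown below. One assumption goes beyond the description above: following common practice, the database stores a salted hash of the password rather than the password itself. The function names are hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import Optional


def hash_password(password: str, salt: bytes) -> bytes:
    """Store this hash in the database instead of the password itself."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def log_in(password: str, salt: bytes, stored_hash: bytes) -> Optional[str]:
    """Return a fresh session token if the password matches, else None."""
    attempt = hash_password(password, salt)
    if hmac.compare_digest(attempt, stored_hash):
        return secrets.token_urlsafe(32)  # random, unguessable session token
    return None
```

The token returned by log_in is what the browser sends with later requests, so the password itself only travels once.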

Cookies and Session Storage

Information like tokens and other user-specific data can be stored in the browser as cookies:

  • First-party cookies: Set by the website itself (the host in the URL).
  • Third-party cookies: Set by external resources, such as scripts or services used by the website. Modern browsers regulate third-party cookies to prevent user tracking.
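As a rough illustration, this is how a server might ask the browser to store a session token as a cookie, using Python's built-in http.cookies module. The cookie name and token value are placeholders.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session"] = "abc123"           # hypothetical session token
cookie["session"]["httponly"] = True    # hide it from page scripts
cookie["session"]["secure"] = True      # send it over https only

# The header line the server would add to its response.
header = cookie.output()
```

The browser keeps the value and attaches it to every later request to the same site, which is how the server recognizes a returning user.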

Single Sign-On (SSO)

Typing usernames and passwords repeatedly is inconvenient. To improve the user experience, we can implement SSO (Single Sign-On). With SSO, if a user has already authenticated with a trusted external service, we accept that service’s verification and authenticate the user.

For example, signing in with Google means Google provides only basic profile information like name and email. However, if our app needs additional access, such as sending emails via Gmail, this requires additional permissions through OAuth protocols, which allow secure and limited access to the user's external account.

Logistics

For delivery, we have partnered with a third-party service, Shiprocket, which offers a hyperlocal delivery system. After an order is placed, we make an API call to Shiprocket, providing the printer’s location and the customer’s address. This creates a delivery task, and Shiprocket shares a tracking ID with us to monitor further details.

Webhooks

To avoid repeatedly asking for delivery status updates, we use webhooks. Webhooks allow us to provide our API details to Shiprocket and request a callback whenever a specific event (like status updates) occurs. This eliminates the need for continuous polling.
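On our side, a webhook is simply an API endpoint that the delivery partner calls. The sketch below shows the kind of function that endpoint might run. The payload fields "tracking_id" and "status" are assumptions for illustration and are not taken from Shiprocket's documentation.

```python
import json

# Our own record of orders, keyed by the tracking ID the partner gave us.
order_status = {"TRK-1": "created"}


def handle_delivery_webhook(raw_body: bytes) -> bool:
    """Called when the delivery partner POSTs an update to our API.

    The payload fields ("tracking_id", "status") are assumptions, not the
    partner's documented format.
    """
    event = json.loads(raw_body)
    tracking_id = event.get("tracking_id")
    if tracking_id not in order_status:
        return False  # unknown delivery; ignore it
    order_status[tracking_id] = event["status"]
    return True
```

With polling, we would call the partner every few minutes to ask "any news?"; with a webhook, this function runs only when there actually is news.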

WebSockets

A similar concept exists for real-time communication between the server and the browser, but this uses WebSockets, which is an entirely different protocol from webhooks. WebSockets enable the server to push updates directly to the browser whenever an event occurs, allowing for real-time interaction.

Payment Integration

With the delivery system complete, we integrated third-party payment gateways to handle transactions. With these elements in place—printing, delivery, and payment—the full workflow is ready, providing a seamless end-to-end solution.

Cron Jobs

Our delivery app has a significant challenge: delivery costs are higher than printing costs. Unless users place larger orders, passing delivery costs onto them may deter smaller orders, reducing overall engagement. To solve this, we can optimize delivery by batching orders instead of processing them individually.

Batching Deliveries

Paper is a lightweight, low-volume commodity, and a single person can easily handle multiple orders. Instead of making an API call to Shiprocket immediately after an order is placed, we can run a program periodically to group undelivered orders and process them as a batch.

This periodic execution is called a Cron Job, derived from the Greek word chronos, meaning time. For instance, a cron job can be scheduled to run hourly.

How It Works

  1. Every hour, the program fetches all undelivered orders.
  2. Orders are grouped based on location using algorithms like K-means or mean-shift for efficient clustering.
  3. Once grouped, an API call is made to Shiprocket with all the details for a batched delivery. (Assume we have special permission for multi-drop-off requests.)

By batching deliveries, we optimize logistics, reduce costs, and encourage smaller orders, creating a more user-friendly and efficient system.
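The hourly job described above can be sketched as follows. To keep the example dependency-free, it groups orders with a simple grid (nearby coordinates fall into the same cell) in place of K-means or mean-shift, so treat it as a simplified swap-in. The order fields and function name are assumptions.

```python
from collections import defaultdict

CRON_SCHEDULE = "0 * * * *"  # standard cron syntax for "at minute 0 of every hour"


def batch_by_area(orders: list[dict], cell_size: float = 0.01) -> list[list[dict]]:
    """Group undelivered orders whose drop-off points fall in the same grid cell.

    A simple grid is used here in place of K-means or mean-shift; a cell of
    0.01 degrees is very roughly a kilometre across.
    """
    cells = defaultdict(list)
    for order in orders:
        if order["delivered"]:
            continue
        key = (round(order["lat"] / cell_size), round(order["lon"] / cell_size))
        cells[key].append(order)
    return list(cells.values())
```

Each inner list returned by batch_by_area would become one batched delivery request. A real clustering algorithm would handle orders near cell borders more gracefully.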

Artificial Intelligence

Even though the application is functional, with printing shops onboarded and users engaging with it, scaling up requires significant capital. Similar to how film producers demand dance numbers and fight scenes in a movie to make it commercially appealing, venture capitalists often look for an AI element in applications before providing funding.

While our product might not necessarily need AI, we can still integrate it in meaningful ways to add value and cater to expectations.

The calculation of the printing cost is straightforward:

total_amount = no_of_pages * cost_per_page

In this equation:

  • no_of_pages represents the number of pages to be printed.
  • cost_per_page is set arbitrarily by the shop owner and is assumed to be fixed.

Given these values, the total cost can be calculated directly. This equation is linear (i.e., it forms a straight line when plotted mathematically). Because the relationship is well-known and predetermined, no advanced algorithms or AI are needed to compute the total amount.

AI becomes relevant when the formula or the relationship itself is not predefined and needs to be dynamically determined at runtime. For example, if we had to generate the graph or curve for a relationship like this dynamically using patterns in real-time data, then an AI algorithm might be necessary.

Let’s take the case of calculating the time to deliver. A naive formula would be:

time_to_deliver = no_of_pages * time_for_each_page

In practice, this relationship isn’t so straightforward. The actual time to deliver may depend on several factors, such as:

  • Time of the day.
  • Availability of the logistics partner.
  • Number of orders and pages in the queue ahead.
  • Additional incoming orders for batch deliveries.

These variables create a dynamic and complex scenario where the time to deliver can’t be determined by a simple formula. In such cases, AI algorithms can be used to analyze these factors and predict delivery times more accurately.

Machine Learning (ML) for Delivery Time Prediction

In scenarios where multiple variables are involved, the difficulty isn't just the number of variables: how each one affects the outcome is unclear and can only be estimated from past data. This is where Machine Learning (ML), a subfield of AI focused on learning patterns from data, becomes highly useful.

The system will use mathematical models to create a multivariate graph for delivery time against factors like:

  • Time of the day.
  • Logistics partner availability.
  • Number of orders in the queue.
  • Additional orders for batch deliveries.

This graph, also called a "model," will be:

  1. Stored for future use.
  2. Constantly updated with new data.

Using this model, the system can predict delivery times for future orders more accurately than a fixed formula could.
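To make the idea of a "model" less abstract, here is a small sketch in Python. It assumes the simplest possible kind of model, a straight-line (linear) fit, and learns one weight per factor from past deliveries. The past orders and the chosen factors (pages in the queue, delivery partners available, peak hour or not) are invented for illustration; a real system would use far more data and typically an ML library rather than hand-written math.

```python
def fit_linear_model(rows, targets):
    """Least-squares fit. Returns weights [base, w1, w2, ...], one per factor."""
    X = [[1.0] + list(r) for r in rows]  # leading 1.0 lets the model learn a base time
    n = len(X[0])
    # Normal equations: (X^T X) w = X^T y
    A = [[sum(x[i] * x[j] for x in X) for j in range(n)] for i in range(n)]
    b = [sum(x[i] * y for x, y in zip(X, targets)) for i in range(n)]
    # Solve with Gaussian elimination and partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    weights = [0.0] * n
    for i in reversed(range(n)):
        known = sum(A[i][j] * weights[j] for j in range(i + 1, n))
        weights[i] = (b[i] - known) / A[i][i]
    return weights

def predict(weights, features):
    """Estimated delivery time in minutes for one new order."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], features))

# Made-up history: (pages in queue, partners available, peak hour 0/1) -> minutes taken.
history = [
    ((20, 3, 0), 14.0), ((40, 2, 1), 41.0), ((10, 5, 0), 5.0),
    ((60, 1, 1), 53.0), ((30, 4, 1), 32.0), ((50, 2, 0), 31.0),
]
weights = fit_linear_model([f for f, _ in history], [t for _, t in history])
print(round(predict(weights, (25, 3, 1)), 1))  # about 31.5 minutes for this toy data

# "Constantly updated": after each finished delivery, append it and fit again.
history.append(((35, 2, 0), 23.5))
weights = fit_linear_model([f for f, _ in history], [t for _, t in history])
```

The list of weights is the stored "model": saving it means future predictions need only a quick multiplication, not a pass over all past orders.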

Adding Generative AI: A Front Page Feature

To introduce Generative AI, we can add a feature for users who print many pages. The system can generate a custom front page that includes:

  • A title,
  • An image, and
  • A summary of the printing content.

Benefits of the Front Page:

  • Adds an extra page to the order, increasing billing.
  • Helps delivery partners identify the content easily.
  • Protects the actual first page from dirt or damage.
  • Offers the option for color printing on the front page, which can be priced at 10x the regular cost.

This add-on gives the product an edge over regular printing services and provides an innovative use case for generative AI.

Natural Language Processing (NLP) and Large Language Models (LLM)

In the delivery time model, the output was simple—a single number. However, generating a front page for printing involves producing a series of texts, which is a more complex task. The system doesn’t just deal with raw data but must process both inputs and outputs in human language.

This task falls under the field of Natural Language Processing (NLP). Fortunately, there are Large Language Models (LLMs) pre-trained on vast amounts of text data available online. These models are:

  • Designed to predict probable next words, creating human-like sentences.
  • Capable of being fine-tuned for specific use cases.
  • Even extended to work with images and videos.
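The phrase "predict probable next words" can be illustrated with a toy example in Python. This sketch simply counts which word most often follows another in a short, made-up sentence. Real LLMs use large neural networks trained on enormous amounts of text, so treat this only as an intuition for the idea, not as how they are built.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for every word, which words follow it and how often."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following

def most_likely_next(following, word):
    """Return the word seen most often after `word`, or None if never seen."""
    candidates = following.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]

# Made-up training text.
corpus = "print the document then deliver the document to the customer"
bigrams = train_bigrams(corpus)
print(most_likely_next(bigrams, "the"))      # prints document
print(most_likely_next(bigrams, "deliver"))  # prints the
```

In this tiny text, "the" is followed by "document" twice and by "customer" once, so "document" is the most probable next word.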

LLMs: Challenges and Solutions

  • Heavy and Resource-Intensive: Running LLMs requires significant processing power, making them expensive.
  • API Services: Thankfully, there are services that provide LLMs as APIs, allowing us to leverage their capabilities without managing the infrastructure.

Implementing the AI Front Page Generator

For our AI-based front page generator, we’ll use OpenAI’s API. The steps are as follows:

  1. Generate Title and Summary

    • After an order is placed, send the content of the print to OpenAI’s API with the instruction:
      "Read the content and give me a title and summary."
    • The API will return a suitable title and summary.
  2. Generate an Image

    • Use the generated title and summary to call an image generator API to create an image.
  3. Create a PDF

    • Combine the title, summary, and generated image into a PDF.
    • Add this PDF as a printable page to the print order.
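The flow of these steps can be sketched as follows. To keep the example runnable without an account or network access, the two AI services are passed in as plain functions: ask_llm and generate_image are hypothetical stand-ins for real API calls, and the fake versions below return canned answers. Asking for the reply as JSON is also an assumption made here so the title and summary are easy to separate. The PDF step is only marked with a comment.

```python
import json

INSTRUCTION = (
    "Read the content and give me a title and summary. "
    'Reply as JSON with the keys "title" and "summary".'
)

def build_front_page(document_text, ask_llm, generate_image):
    """Run steps 1 and 2 and return the pieces needed for the front-page PDF."""
    # Step 1: ask the language model for a title and summary.
    reply = ask_llm(INSTRUCTION + "\n\n" + document_text)
    fields = json.loads(reply)
    title, summary = fields["title"], fields["summary"]
    # Step 2: turn the title and summary into a prompt for the image generator.
    image = generate_image("Cover image for: " + title + ". " + summary)
    # Step 3 (not shown): a PDF library would lay these out on a single page.
    return {"title": title, "summary": summary, "image": image}

# Stand-ins so the sketch runs without network access or an API key.
def fake_llm(prompt):
    return json.dumps({"title": "Rental Agreement", "summary": "A 12-month lease."})

def fake_image_generator(prompt):
    return b"image-bytes-would-go-here"

page = build_front_page("...full text of the print order...", fake_llm, fake_image_generator)
print(page["title"])  # prints Rental Agreement
```

In a real deployment, the two fake functions would be swapped for functions that call the chosen AI provider, while build_front_page itself would stay the same.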

This completes the AI Front Page Generator. With this, we’ve successfully introduced an advanced and practical use of AI to enhance the product.

Note: AI is a vast topic, and explaining it in detail requires more space. Hence, we’ve moved this explanation to a separate blog titled "Explain Like I’m 55: AI," which will cover the basics of AI, including LLMs, NLP, and their applications, in a simple and relatable way. Sign up for our newsletter to get notified when the blog is released!

The End

With investments secured and a working product, it was time to scale operations and aim to become the Swiggy of printing. But then Blinkit and Zepto added printing services to their catalogs. Their competitive edge was undeniable: they had their own delivery networks offering 10-minute deliveries, printers in dark stores for immediate fulfillment, better margins, and even the ability to bundle printing with other products for free delivery. They could even monetize further by printing ads for discounts, much like websites do. Against this competition, it became clear that the business model wasn’t sustainable.

While an optimistic perspective suggested that traditional printers might need us more than ever to survive the shift to online services, the numbers didn’t add up. After careful consideration, you decided to return the investments and shut things down before fully launching.

However, your developer friends had another idea—they didn’t want the hard work to go to waste. They suggested making the project open-source so others could build on it or find new uses for it. You also hosted a basic working version of the platform for anyone to opt into and use. It wasn’t the outcome you had envisioned, but it felt right to let the project live on in some way.

It was nice while it lasted.
