If you’ve opened the SLOs Overview in the Google Cloud console recently, you may have seen this notice:
This notice announces a recent change in the way services are defined for Cloud Monitoring. Before the change, Cloud Monitoring automatically discovered services provisioned in App Engine, Cloud Run, or Google Kubernetes Engine (GKE) and automatically populated them in the Services Overview dashboard.
Now, all services in the Services Overview dashboard have to be created explicitly. To simplify this task, when you define a new service in the console UI you are presented with a list of candidates built from the auto-discovered services. The full list of auto-discovered services includes managed services from App Engine, Cloud Run, and Istio, as well as GKE workloads and services.
Besides using the UI, you can add managed services to Cloud Monitoring using the services.create API or the Terraform google_monitoring_service resource.
For example, if you have a GKE cluster named cluster-001 provisioned in the us-central1 region that has a service frontend in the default namespace, the following command in Cloud Shell defines this service for Cloud Monitoring:
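One way to issue that request from Cloud Shell is with curl against the services.create endpoint. The following is a minimal sketch; the service ID is illustrative, and the label keys follow the camel-case notation described below, so verify the exact field names against the services.create reference:

PROJECT_ID=$(gcloud config get-value project)
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://monitoring.googleapis.com/v3/projects/${PROJECT_ID}/services?service_id=frontend-svc" \
  -d '{
    "displayName": "frontend",
    "basicService": {
      "serviceType": "GKE_SERVICE",
      "serviceLabels": {
        "location": "us-central1",
        "clusterName": "cluster-001",
        "namespaceName": "default",
        "serviceName": "frontend"
      }
    }
  }'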
When using the Terraform resource, the keys of the service_labels argument should be converted from the camel-case notation used in the API documentation to snake-case notation. For example, the command above looks like the following in Terraform:
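A minimal sketch of the equivalent resource; the resource name and service_id are illustrative, var.project_id is a hypothetical variable, and the argument names should be confirmed against the google_monitoring_service provider documentation:

resource "google_monitoring_service" "frontend" {
  project      = var.project_id   # hypothetical variable
  service_id   = "frontend-svc"
  display_name = "frontend"

  basic_service {
    service_type = "GKE_SERVICE"
    service_labels = {
      location       = "us-central1"
      cluster_name   = "cluster-001"
      namespace_name = "default"
      service_name   = "frontend"
    }
  }
}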
When your definition of the service does not match one of the managed services one to one, you can add it to Cloud Monitoring by defining a custom service, using the same API request:
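For example, a sketch of such a request for a custom service; the service name checkout-workflow is illustrative:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://monitoring.googleapis.com/v3/projects/${PROJECT_ID}/services?service_id=checkout-workflow" \
  -d '{
    "displayName": "checkout-workflow",
    "custom": {}
  }'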
Or you can use the dedicated Terraform resource, google_monitoring_custom_service:
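A minimal sketch, with an illustrative service ID and a hypothetical var.project_id:

resource "google_monitoring_custom_service" "checkout" {
  project      = var.project_id   # hypothetical variable
  service_id   = "checkout-workflow"
  display_name = "checkout-workflow"
}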
Compared to a custom service, the auto-detected services come with two predefined SLIs for availability and latency. These SLIs utilize the metrics of the managed services that are automatically captured such as request processing time or HTTP request status. For custom services these SLIs have to be defined explicitly using request-based or window-based SLIs.
Check out creating SLOs and SLO-based alerts to find more information about tracking your service SLO and error budgets. And see this blog to learn about the predefined SLIs that are used in availability and latency SLOs.
Have you ever wondered if there is a more automated way to copy Artifact Registry or Container Registry Images across different projects and Organizations? In this article we will go over an opinionated process of doing so using serverless components in Google Cloud and its deployment with Infrastructure as Code (IaC).
This article assumes knowledge of coding in Python, a basic understanding of running commands in a terminal, and familiarity with the HashiCorp Configuration Language (HCL), i.e. Terraform, for IaC.
In this use case, we have at least one frequently updated container image residing in an Artifact Registry repository that needs to be propagated to Artifact Registry repositories in other organizations. Although the images are released to external organizations, they should remain private and not be available for public use.
To clearly articulate how this approach works, let’s first cover the individual components of the architecture and then tie them all together.
As discussed earlier, we have two Artifact Registry (AR) repositories in question; let’s call them “Source AR” (the AR where the image is periodically built and updated, the source of truth) and “Target AR” (AR in a different organization or project where the image needs to be consumed and propagated periodically) for ease going forward. The next component in the architecture is Cloud Pub/Sub; we need an Artifact Registry Pub/Sub topic in the source project that automatically captures updates made to the source AR. When the Artifact Registry API is enabled, Artifact Registry automatically creates this Pub/Sub topic; the topic is called “gcr” and is shared between Artifact Registry and Google Container Registry (if used). Artifact Registry publishes messages for the following changes to the topic:
Image uploads
New tags added to images
Image deletion
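Each of these changes arrives as a small JSON payload. A message for a newly pushed image tagged "latest" looks roughly like the following (the image paths are illustrative; see the Pub/Sub notifications documentation linked at the end of this article for the exact schema):

{
  "action": "INSERT",
  "digest": "us-docker.pkg.dev/source-project/source-repo/image-x@sha256:DIGEST",
  "tag": "us-docker.pkg.dev/source-project/source-repo/image-x:latest"
}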
Although the topic is created for us, we will need to create a Pub/Sub subscription to consume the messages from the topic. This brings us to the next component of the architecture, Cloud Run. We will create a Cloud Run deployment that will perform the following:
Parse through the Pub/Sub messages
Compare the contents of the message to validate if the change in the Source AR warrants an update to the Target AR
If the validation conditions are met, then the Cloud Run service moves the latest Docker image to the Target AR
Now, let’s dive into how Cloud Run integrates with the Pub/Sub AR topic. For Cloud Run to be able to read the Pub/Sub messages we have two additional components; an EventArc trigger and a Pub/Sub subscription. The EventArc trigger is critical to the workflow as it is what triggers the Cloud Run service.
In addition to the components described above, the below prerequisites need to be met for the entire flow to function correctly.
Cloud SDK needs to be installed on the user's terminal so that you can run gcloud commands.
The project Service Account (SA) will need “Read” permission on the Source AR.
The Project SA will need “Write” permission on the Target AR.
VPC-SC requirements on the destination organization (if enabled)
Egress Permissions to the target repository from the SA running the job
Ingress permission for the account running the 'make' commands (instructions below) and writing to Artifact Registry or Container Registry
Ingress permission to read the Pub/Sub gcr topic of the source repository
Allow [project-name]-sa@[project-name].iam.gserviceaccount.com VPC-SC ingress for the Artifact Registry method
Allow [project-name]-sa@[project-name].iam.gserviceaccount.com VPC-SC ingress for the Cloud Run method
Below we talk about the Python code, the Dockerfile and the Terraform code, which are all you need to implement this yourself. We recommend that you open our GitHub repository, where all of the open source code for this solution lives, while reading the section below. Here's the link: https://github.com/GoogleCloudPlatform/devrel-demos/tree/main/devops/inter-org-artifacts-release
What we deploy in Cloud Run is a custom Docker container. It comprises the following files:
app.py: contains the variables for the source and target containers, as well as the execution code that is triggered by the incoming Pub/Sub messages (a minimal sketch of this logic follows the file list below).
copy_image.py: contains the copy function that app.py leverages in order to run the gcrane command required to copy images from the Source AR to the Target AR.
Dockerfile: contains the instructions needed to package gcrane and the requirements needed to build the Cloud Run image.
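To make the flow concrete, here is a minimal sketch of the kind of logic app.py implements. The environment variables, the Flask wiring and the copy helper are illustrative assumptions; refer to the repository above for the actual code:

import base64
import json
import os

from flask import Flask, request

import copy_image  # illustrative helper that wraps the gcrane copy command

# Illustrative configuration; the real service reads its own settings.
SOURCE_IMAGE = os.environ.get("SOURCE_IMAGE")  # e.g. us-docker.pkg.dev/src-proj/src-repo/image-x
TARGET_IMAGE = os.environ.get("TARGET_IMAGE")  # e.g. us-docker.pkg.dev/tgt-proj/tgt-repo/image-x
WATCHED_TAG = os.environ.get("WATCHED_TAG", "latest")

app = Flask(__name__)

@app.route("/", methods=["POST"])
def handle_message():
    # Eventarc delivers the Pub/Sub message as a JSON envelope with a base64-encoded payload.
    envelope = request.get_json()
    payload = json.loads(base64.b64decode(envelope["message"]["data"]).decode("utf-8"))

    # gcr notifications carry an action (INSERT/DELETE) and, for tag updates, a fully qualified tag.
    if payload.get("action") == "INSERT" and payload.get("tag") == f"{SOURCE_IMAGE}:{WATCHED_TAG}":
        copy_image.copy(f"{SOURCE_IMAGE}:{WATCHED_TAG}", f"{TARGET_IMAGE}:{WATCHED_TAG}")
        return ("Image copied to the target Artifact Registry.", 200)

    return ("The source AR updates were not made to the watched image. No image will be updated.", 200)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))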
Since we have now covered all of the individual components that are associated with this architecture, let’s walk through the flow that ties all the individual components together.
Let’s say your engineering team has built and released a new version of the Docker Image “Image X”, per their release schedule and added the “latest” tag to it. This new version is sitting in the Source AR and when the new version gets created, the AR Pub/Sub topic updates the message that reflects that a new version of the “Image X” has been added to the source AR. This automatically causes the EventArc trigger to poke the Cloud Run service to scrape the messages from the Pub/Sub subscription.
Our Cloud Run service uses the logic written in app.py to check whether the action that happened in the Source AR matches the criteria specified (Image X with the tag "latest"). If the action matches and warrants a downstream action, Cloud Run triggers copy_image.py to execute the gcrane command that copies the image and tag from the Source AR to the Target AR.
If the image or tag does not match the criteria specified in app.py (e.g., Image Y with the tag "latest"), the Cloud Run service returns an HTTP 200 response with the message "The source AR updates were not made to the [Image X]. No image will be updated.", confirming that no action will be taken.
Note: Because the Source AR may contain multiple images and we are only concerned with updating specific images in the Target AR, we have integrated output responses into the Cloud Run service that can be viewed in the Google Cloud logs for troubleshooting and diagnosing issues. This also prevents unwanted publishing of images other than the desired image(s) in question.
Why did we not go with an alternative approach?
Versatility: The Source and Target ARs were in different Organizations.
Compatibility: The Artifacts were not in a Code/Git repository compatible with solutions like Cloud Build.
Security: VPC-SC perimeters limit the tools we can leverage while using cloud native serverless options.
Immutability: We wanted a solution that could be fully deployed with Infrastructure as Code.
Scalability and Portability: We wanted to be able to update multiple Artifact Registries in multiple Organizations simultaneously.
Efficiency and Automation: Avoids a time-based pull method when no resources are being moved. Avoids human interaction to ensure consistency.
Cloud Native: Alleviates the dependency on third-party tools or solutions like a CI/CD pipeline or a repository outside of the Google Cloud environment. If the upstream projects where the images are coming from all reside in the same Google Cloud region or multi-region, a great alternative to solve the problem is virtual repositories.
How do we deploy it with IaC?
We have provided the Terraform code we used to solve this problem.
The following variables will be used in the code. These variables will need to be replaced or declared within a .tfvars file and assigned a value based on the specific project.
var.gcp_project
var.service_account

In conclusion, there are multiple ways to bootstrap a process for releasing artifacts across Organizations. Each method has its pros and cons, and the best approach is determined by evaluating the use case at hand. The things to consider are whether the artifacts can reside in a Git repository, whether the target repository is in the same Organization or a child Organization, and whether CI/CD tooling is preferred.
If you have gotten this far, it's likely you have a good use case for this solution. This pattern can also be applied to other, similar use cases. Here are a couple of examples to get you started:
Copying other types of artifacts from AR repositories like Kubeflow Pipeline Templates (kfp)
Copying bucket objects behind a VPC-SC between projects or Orgs
Learn more
Our solution code can be found here: https://github.com/GoogleCloudPlatform/devrel-demos/tree/main/devops/inter-org-artifacts-release
gcrane: https://github.com/google/go-containerregistry/blob/main/cmd/gcrane/README.md
Configuring Pub/Sub GCR notifications: https://cloud.google.com/artifact-registry/docs/configure-notifications
Ensuring application and service teams have the resources they need is crucial for platform administrators. Fleet team management features in Google Kubernetes Engine (GKE) make this easier, allowing each team to function as a separate “tenant” within a fleet. In conjunction with Config Sync, a GitOps service in GKE, platform administrators can streamline resource management for their teams across the fleet.
Specifically, with Config Sync team scopes, platform admins can define fleet-wide and team-specific cluster configurations such as resource quotas and network policies, allowing each application team to manage their own workloads within designated namespaces across clusters.
Let’s walk through a few scenarios.
Separating resources for frontend and backend teams
Let’s say you need to provision resources for frontend and backend teams, each requiring their own tenant space. Using team scopes and fleet namespaces, you can control which teams access specific namespaces on specific member clusters.
For example, the backend team might access their bookstore and shoestore namespaces on us-east-cluster and us-west-cluster clusters, while the frontend team has their frontend-a and frontend-b namespaces on all three member clusters.
Unlocking Dynamic Resource Provisioning with Config Sync
You can enable Config Sync by default at the fleet level using Terraform. Here's a sample Terraform configuration:
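A minimal sketch of what that configuration could look like. The repository URL and the var.project_id variable are placeholders, the branch and policy directory follow the example described here, and the block names should be checked against the google_gke_hub_feature documentation:

resource "google_gke_hub_feature" "configmanagement" {
  project  = var.project_id   # hypothetical variable
  name     = "configmanagement"
  location = "global"

  fleet_default_member_config {
    configmanagement {
      config_sync {
        source_format = "unstructured"
        git {
          sync_repo   = "https://github.com/your-org/your-config-repo"  # placeholder repository
          sync_branch = "main"
          policy_dir  = "fleet-tenancy/config"
          secret_type = "none"
        }
      }
    }
  }
}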
Note: Fleet defaults are only applied to new clusters created in the fleet.
This Terraform configuration enables Config Sync as a default fleet-level feature. It installs Config Sync and instructs it to fetch Kubernetes manifests from a Git repository (specifically, the “main” branch and the “fleet-tenancy/config” folder). This configuration automatically applies to all clusters subsequently created within the fleet. This approach offers a powerful way of configuring manifests across fleet clusters without the need for manual installation and configuration on individual clusters.
Now that you’ve configured Config Sync as a default fleet setting, you might want to sync specific Kubernetes resources to designated namespaces and clusters for each team. Integrating Config Sync with team scopes streamlines this process.
Setting team scope
Following this example, let's assume you want to apply a different network policy for the backend team compared to the frontend team. Fleet team management features simplify the process of provisioning and managing infrastructure resources for individual teams, treating each team as a separate "tenant" within the fleet.
To manage separate tenancy, as shown in the above team scope diagram, first set up team scopes for the backend and frontend teams. This involves defining fleet-level namespaces and adding fleet member clusters to each team scope.
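As a rough sketch, the backend team scope, its fleet namespaces, and a cluster binding could be created with commands along these lines; the names are illustrative, and the exact flags should be checked against the gcloud fleet reference:

gcloud container fleet scopes create backend
gcloud container fleet scopes namespaces create bookstore --scope=backend
gcloud container fleet scopes namespaces create shoestore --scope=backend
gcloud container fleet memberships bindings create backend-us-east \
    --membership=us-east-cluster --scope=backend --location=us-east1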
Now, let’s dive into those Kubernetes manifests that Config Sync syncs into the clusters.
Applying team scope in Config Sync
Each fleet namespace in the cluster is automatically labeled with fleet.gke.io/fleet-scope: SCOPE_NAME. For example, the backend team scope contains the fleet namespaces bookstore and shoestore, both labeled with fleet.gke.io/fleet-scope: backend.
Config Sync's NamespaceSelector utilizes this label to target specific namespaces within a team scope. Here's the configuration for the backend team:
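A sketch of what that NamespaceSelector could look like; the name backend-scope is illustrative, and mode: dynamic lets the selector pick up matching namespaces as they appear:

apiVersion: configmanagement.gke.io/v1
kind: NamespaceSelector
metadata:
  name: backend-scope
spec:
  mode: dynamic
  selector:
    matchLabels:
      fleet.gke.io/fleet-scope: backend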
Applying NetworkPolicies for the backend team
By annotating resources with configmanagement.gke.io/namespace-selector: NAMESPACE_SELECTOR_NAME, they're automatically applied to the right namespaces. Here's the NetworkPolicy for the backend team:
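For example, a policy that only allows ingress from pods in the same namespace might look like this; the policy itself is illustrative, and the important part is the annotation referencing the backend NamespaceSelector above:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-allow-same-namespace
  annotations:
    configmanagement.gke.io/namespace-selector: backend-scope
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector: {}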
This NetworkPolicy is automatically provisioned in the backend team’s bookstore and shoestore namespaces, adapting to fleet changes like adding or removing namespaces and member clusters.
Extending the concept: ResourceQuotas for the frontend team
Here's how a ResourceQuota is dynamically applied to the frontend team's namespaces:
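A sketch, assuming a frontend-scope NamespaceSelector defined analogously to the backend one; the quota values are illustrative:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: frontend-quota
  annotations:
    configmanagement.gke.io/namespace-selector: frontend-scope
spec:
  hard:
    pods: "100"
    requests.cpu: "4"
    requests.memory: 8Gi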
Similarly, this ResourceQuota targets the frontend team’s frontend-a and frontend-b namespaces, dynamically adjusting as the fleet’s namespaces and member clusters evolve.
Delegating resource management with Config Sync: Empowering the backend team
To allow the backend team to manage their own resources within their designated bookstore namespace, you can use Config Sync's RepoSync, and a slightly different NamespaceSelector.
Targeting a specific fleet namespace
To zero in on the backend team's bookstore namespace, the following NamespaceSelector targets both the team scope and the namespace name by labels:
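A sketch of such a selector, matching on both the team-scope label and the namespace's built-in kubernetes.io/metadata.name label; the name backend-bookstore is illustrative:

apiVersion: configmanagement.gke.io/v1
kind: NamespaceSelector
metadata:
  name: backend-bookstore
spec:
  mode: dynamic
  selector:
    matchLabels:
      fleet.gke.io/fleet-scope: backend
      kubernetes.io/metadata.name: bookstore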
Introducing RepoSync
Another Config Sync feature is RepoSync, which lets you delegate resource management within a specific namespace. For security reasons, RepoSync has no default access; you must explicitly grant the necessary RBAC permissions to the namespace.
Leveraging the NamespaceSelector, the following RepoSync resource and its respective RoleBinding can be applied dynamically to all bookstore namespaces across the backend team's member clusters. The RepoSync points to a repository owned by the backend team:
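A sketch of what that pair could look like. The repository URL is a placeholder, and the reconciler service account name follows Config Sync's default ns-reconciler-NAMESPACE pattern, so verify both against the RepoSync documentation:

apiVersion: configsync.gke.io/v1beta1
kind: RepoSync
metadata:
  name: repo-sync
  annotations:
    configmanagement.gke.io/namespace-selector: backend-bookstore
spec:
  git:
    repo: https://github.com/your-org/backend-team-repo   # placeholder for the backend team's repository
    branch: main
    auth: none
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: repo-sync-admin
  annotations:
    configmanagement.gke.io/namespace-selector: backend-bookstore
subjects:
- kind: ServiceAccount
  name: ns-reconciler-bookstore
  namespace: config-management-system
roleRef:
  kind: ClusterRole
  name: admin
  apiGroup: rbac.authorization.k8s.io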
Note: The .spec.git section would reference the backend team’s repository.
The backend team’s repository contains a ConfigMap. Config Sync ensures that the ConfigMap is applied to the bookstore namespaces across all backend team’s member clusters, supporting a GitOps approach to management.
Easier cross-team resource management
Managing resources across multiple teams within a fleet of clusters can be complex. Google Cloud’s fleet team management features, combined with Config Sync, provide an effective solution to streamline this process.
In this blog, we explored a scenario with frontend and backend teams, each requiring their own tenant spaces and resources (NetworkPolicies, ResourceQuotas, RepoSync). Using Config Sync in conjunction with the fleet management features, we automated the provisioning of these resources, helping to ensure a consistent and scalable setup.
Next steps
Learn how to use Config Sync to sync Kubernetes resources to team scopes and namespaces.
To experiment with this setup, visit the example repository. Config Sync configuration settings are located within the config_sync block of the Terraform google_gke_hub_feature resource.
For simplicity, this example uses a public Git repository. To use a private repository, create a Secret in each cluster to store authentication credentials.
To learn more about Config Sync, see Config Sync overview.
To learn more about fleets, see Fleet management overview.
Is your team keeping up with the rapid pace of change? Are you able to meet your customers’ expectations while delivering business value and maintaining a healthy team? Take 15 minutes to complete the 2024 DORA Survey. Your participation provides insights for our ongoing research and gives you a moment to reflect on how your team is doing today.
For more than a decade, DORA has been researching the capabilities of technology-driven teams and the ways these teams improve key outcomes like:
Software delivery performance: How quickly and reliably teams can deliver software changes.
Organizational performance: How effectively organizations achieve their business goals.
Team well-being: How work practices and culture impact the well-being of software development teams.
DORA’s annual survey is now open and we encourage you and your team to participate now.
Key findings from 2023
DORA’s Accelerate State of DevOps Report 2023 included a number of key findings and insights that can help your team improve.
More than half of respondents were using AI for some technical tasks. We expect to see an increase this year!
Culture is foundational to building technical capabilities.
High-quality documentation leads to 25% higher team performance.
Teams with faster code reviews realize a 50% improvement in software delivery performance.
2024 and beyond
Broadly applicable
DORA's research benefits from input from a wide variety of roles working in every industry, and provides insights to teams working on all types of applications and services.
Who should take the survey? We are interested in learning from anyone who is working on or with technology-driven teams. Here are a few roles that will provide valuable input for our research and can benefit from taking the survey.
Technology leaders
Product managers
Architects
Software developers
Quality engineers
Compliance and security professionals
Operators
Site reliability engineers
Platform engineers
User experience professionals
Release engineers
And more
The survey is anonymous and platform agnostic; we are trying to understand how you work to deliver great customer experiences while maintaining a healthy, productive team.
A moment of reflection
The 15 minutes you spend taking the survey can serve as a moment of reflection. Careful consideration of the survey’s questions often helps shine a light on areas that you and your team can improve. Some teams have found value in dedicating time for everyone to take the survey and then spend 20-30 minutes discussing the experience. Here are some questions that you might use with your team to facilitate a discussion.
Which questions were difficult to answer and why?
Which areas do you feel we could improve as a team?
Where do you feel we are excelling as a team?
What one thing would you like us to try as a team?
A look to the future
DORA has always tried to understand and evaluate the impact of emerging trends. This year is no different. The industry continues to see a rapid rate of change and there are some new technologies, topologies, and approaches that teams are prioritizing. This year we’re trying to learn from you about your efforts in three key areas:
Artificial Intelligence (AI) - Last year we asked the importance of AI in your daily work. This year we are expanding our inquiry in this area to better understand how AI is changing your work and the impact of those changes. How is AI impacting organizational performance?
Platform Engineering - We want to better understand how your organization is approaching platform engineering. Platform engineering may include both technologies and teams. How does platform engineering impact software delivery performance?
Developer Experience - What is your overall experience like as you work to deliver for your customers? How does your experience impact the value you’re able to deliver?
DORA helps teams like yours
Insights from DORA's research, and adopting the practices that predict improved performance, have helped technology-driven teams in organizations around the world. ANZ, Sabre, and Wayfair are just three examples. "To successfully supercharge our DevOps evolution, ANZx decided to adopt the … practices found in the DORA research program." "Between the technology and DORA research principles Google Cloud helped us introduce, we saw notable improvements in the five essential characteristics of cloud computing." "Mostly we're using the DORA metrics. Broad metrics that really take effect over the broad developer ecosystem: are we, in fact, making things faster, better?"
Your voice matters
Your anonymous input is an important part of our research process and helps shape our collective understanding of the state of technology-driven teams. Set aside 15 minutes to take the survey, reflect on the experience, and take the next step in your journey of continuous improvement. This year’s survey is available in English, Español, Français, Português, 日本語, and 简体中文. Don’t delay, the survey will be closing soon.
Google Cloud Champion Innovators are a global network of more than 600 non-Google professionals, who are technical experts in Google Cloud products and services. Each Champion specializes in one of nine different technical categories which are cloud AI/ML, data analytics, hybrid multi-cloud, modern architecture, security and networking, serverless app development, storage, Workspace and databases.
In this interview series we sit down with Champion Innovators across the world to learn more about their journeys, their technology focus, and what excites them.
Today we’re talking to Juan Guillermo Gómez. Currently Technical Lead at Wordbox, Juan Guillermo is a Cloud Architect, Google Developer Expert, Serverless App Development Champion, and community builder who is regularly invited to share his thoughts on software architecture, entrepreneurship, and innovation at events across Latin America.
Natalie Tack (Google Cloud Editorial): What technology area are you most fascinated with, and why?
Juan Guillermo Gómez: As a kid, I dreamed of developing software that would change the world. I’ve always been fascinated by programming and the idea of creating products and services that make people’s lives easier. Nowadays, my focus is on architecture modernization and using innovative new tech to scale and improve systems. Following a Google Cloud hackathon around 10 years ago, I started to take an interest in serverless architecture and became really enthusiastic about Google Cloud serverless computing services, which enable you to focus on coding without worrying about infrastructure. Nowadays, I’m excited about the potential of AI to help us create better, more robust, and more efficient systems, and I’m really looking forward to seeing where that will take us.
NT: As a developer, what’s the best way to go about learning new things?
JGG: There’s a wealth of resources out there, whether it’s YouTube, podcasts or developer blogs. I find the Google Cloud developer blog and YouTube channel particularly instructive when it comes to real use cases. But the most important thing in my opinion is to be part of a community. It enables you to share experiences, collaborate on projects and learn from others with expertise in specific industries. Google Developer Groups, or GDGs, for example are a great way to network and keep up with the latest developments. Becoming a Google Cloud Champion Innovator has been really beneficial. It enables me to learn directly from Googlers, collaborate around shared problems, and work directly with other people from my field. I can then share that knowledge with my community in Latin America, both as co-organizer of GDG Cali and via my podcast Snippets Tech.
NT: Could you tell us more about your experience as an Innovator?
JGG: I joined the Innovators program in September 2022, and it was a natural fit from the start. The Innovator culture is deeply rooted in that of developer communities, which I’ve participated in for over 20 years now. The core philosophy is sharing, collaboration and learning from others’ experiences, not only through getting together at talks and conferences, but also open source software and libraries. Basically, creating things that benefit the community at large: the Innovator mindset is fundamentally collaborative.
As a result, the program provides a wealth of information, and one of the biggest benefits from my point of view is the amount of real use cases it gives access to. I’ve learnt a lot about embedding and semantic similarity that I’m putting into practice in my work as Technical Lead at Wordbox, a startup that helps people learn languages through TV and music.
NT: What are some upcoming trends you’re enthusiastic about?
JGG: Generally speaking, I’m very interested to see where we’ll go with generative AI. I started working with Google Cloud APIs such as Translate and Speech-to-Text around three years ago as part of my work with Wordbox, and I’m impressed with the way Google Cloud democratizes machine learning and AI, allowing anyone to work with it without extensive machine learning knowledge.
There are so many potential use cases for gen AI. As a developer, you can work faster, solve programming language challenges, write unit tests, and refer to best practices. As an Innovator, I gained early access to Duet AI for Developers (now called Gemini Code Assist), which is a great tool to fill in knowledge gaps, suggest scripts and help out junior architects or those that are new to Google Cloud - basically helping you focus on creating great code.
NT: Can you tell us about an exciting new project you’re working on?
JGG: The Wordbox language learning app features a collection of short videos with English phrases that we show users based on semantic similarity, where the phrase in the new video is similar to the previous one. To enable that, we use Vertex AI, PaLM 2 and Vector Search, and are keen to explore Gemini models, as they offer advanced capabilities for comparing semantic similarities not only between text, but also between text and video, which would enable us to create stories around specific phrases. For example, if a user is watching a video related to a “Game of Thrones” review series and learning certain expressions from it, we can use Gemini models to find similarities with other videos or texts. This will allow us to produce a narrative around the learned expression, creating a comprehensive learning environment that’s tailored to the user’s proficiency level. The learner can then read or watch the story, answer questions and engage in a more interactive, personalized learning experience.
As a side project, I’m also working on an AI-enabled platform that helps musicians and singers create lyrics based on keywords, genre, and context. They input the information, and the platform generates lyrics that hopefully serve as a jumping-off point for a great new piece of music.
NT: What advice would you give to budding innovators?
JGG: The Innovators program can be summed up in three words: networking, learning, and growth. I’d advise anyone interested in becoming an Innovator to be proactive both in learning and in engaging with their fellow developers. It feels pretty great to be given early access to an API and then a few months later tell your community about it while creating fantastic new features for your clients.
Take the next steps on your Google Cloud journey and learn more about the Google Cloud Innovators Program, designed to help developers and practitioners grow their Google Cloud skills and advance their careers. No matter where you are on your cloud journey, the Innovators Program has something for you!
At Ninja Van, running applications in a stable, scalable environment is business-critical. We are a fast-growing tech-enabled express logistics business with operations across South East Asia. In the past 12 months, we’ve partnered with close to two million businesses to deliver around two million parcels daily to customers, with the help of more than 40,000 workers and employees.
We are an established customer of Google Cloud, and continue to work closely with them to expand into new products and markets, including supply chain management and last mile courier services.
Deploying a secure, stable, container platform
To run the applications and microservices architecture that enable our core businesses, we opted to deploy a secure and stable container platform. We had been early adopters of containerization, initially running our container workloads on CoreOS with a custom scheduler, Fleet. As we monitored activity in the open source community, it became evident that Kubernetes was gaining traction and becoming more and more popular. We decided to run our own Kubernetes cluster and API server, later deploying a number of Compute Engine virtual machines consisting of both control plane and worker nodes.
As we dived deeper into Kubernetes, we realized that a lot of what we did was already baked into its core functionalities, such as service discovery. This feature enables applications and microservices to communicate with each other without the need to know where the container is deployed to among the worker nodes. We felt that if we continued maintaining our discovery system, we would just be reinventing the wheel. Thus, we dropped what we had in favor of this Kubernetes core feature.
We also found the opportunity to engage with and contribute to the open source community working on Kubernetes compelling. With Kubernetes being open standards-based, we gained the freedom and flexibility to adapt our technology stack to our needs.
However, we found upgrading self-managed Kubernetes challenging and troublesome in the early days, and decided to move to a fully managed Kubernetes service. Among other benefits, we could upgrade Kubernetes easily through the Google Kubernetes Engine (GKE) user interface or the Google Cloud command-line tool. We could also enable auto upgrades simply by specifying an appropriate release channel.
Simplifying the operation of multiple GKE clusters
With a technology stack based on open-source technologies, we have simplified the operation of multiple clusters in GKE. For monitoring and logging operations, we have moved to a centralized architecture that colocates the technology stack on a single management GKE cluster. We use Elasticsearch, Fluent Bit and Kibana for logging, and Thanos, Prometheus and Grafana for monitoring. When we first started, logging and monitoring were distributed across individual clusters, which meant that administrators had to access each cluster separately, leading to a lot of operational inefficiency. It also meant maintaining duplicated charts across different instances of Kibana and Grafana.
For CI/CD, we run a dedicated DevOps GKE cluster used for hosting developer toolsets and running the pipelines. We embrace the Atlassian suite of services, such as JIRA, Confluence, Bitbucket, Bamboo to name a few. These applications are hosted in the same DevOps GKE cluster.
Our CI/CD is centralized, but custom steps can be put in, giving some autonomy to teams. When a team pushes out new versions of codes, our CI pipeline undertakes the test and build processes. This then produces container image artifacts for backend services and minified artifacts for frontend websites. Typically, the pipelines are parameterized to push out and test the codes in development and testing environments, and subsequently deploy them into sandbox and production. Depending on their requirements, application teams have the flexibility to opt for a fully automated pipeline or a semi-automated one, that requires manual approval for deployment to production.
Autoscaling also comes into play at runtime. As more requests come in, we need to scale up to more containers, but shrink back down automatically as the number of requests decreases. To support autoscaling based on our own metrics, we integrated KEDA into our technology stack.
Delivering a seamless development experience
In a sense, our developers are our customers as well as our colleagues. We aim to provide a frictionless experience for them by automating the testing, deployment, management and operation of application services and infrastructure. By doing this, we allow them to focus on building the application they’re assigned.
To do this, we’re using DevOps pipelines to automate and simplify infrastructure provisioning and software testing and release cycles. Teams can also self-serve and spin up GKE clusters with the latest environment builds mirroring production with non-production sample data preloaded, which gives them a realistic environment to test the latest fixes and releases. As a build proceeds to deployment, they can visit a Bamboo CI/CD Console to track progress.
Code quality is critical to our development process. An engineering excellence sub-vertical within our business monitors code quality through part of our CI/CD using the SonarQube open-source code inspection tool. We set stringent unit test coverage percentage requirements and do not allow anything into our environment that fails unit or integration testing.
We release almost once a day excluding code freezes on days like Black Friday or Cyber Monday, when we only release bug fixes or features that need to be deployed urgently during demand peaks. While we’ve not changed our release schedule with GKE, we’ve been able to undertake concurrent builds in parallel environments, enabling us to effectively manage releases across our 200-microservice architecture.
Latency key to Google Cloud and GKE selection
When we moved from other cloud providers to Google Cloud and GKE, the extremely low latency between Google Cloud data centers, in combination with reliability, scalability and competitive pricing, gave us confidence Google Cloud was the right choice. In a scenario with a lot of people using the website, we can provide a fast response time and a better customer experience.
In addition, Google Cloud makes the patching and upgrading of multiple GKE clusters a breeze by automating the upgrade process and proactively informing us when an automated upgrade or change threatens to break one of our APIs.
Google Cloud and GKE also open up a range of technical opportunities for our business, including simplification. Many of our services use persistent storage, and GKE provides a simple way to automatically deploy and manage it through the Container Storage Interface (CSI) driver. The CSI driver enables native volume snapshots which, in conjunction with Kubernetes CronJobs, let us easily take automated backups of the disks running services such as TiDB, MySQL, Elasticsearch and Kafka. On the development front, we are also exploring Skaffold, which opens up possibilities for development teams to improve their inner development loop and develop more effectively, as if their local running instance were deployed within a Kubernetes cluster.
Overall, the ease of management of GKE means we can manage our 200-microservice architecture in 15 clusters with just 10 engineers even as the business continues to grow.
If you want to try this out yourself, check out a hands-on tutorial where you will set up a continuous delivery pipeline in GKE that automatically rebuilds, retests, and redeploys changes to a sample application.
Introducing App Hub
Cloud applications depend on a complex web of interacting services and cloud resources — think databases, load balancers, and Kubernetes clusters. Developers spend significant time and effort navigating and classifying these to understand their composition and dependencies. This is not only time-consuming, but fragile. While you may start with a well-defined hierarchy, applications and services evolve, yet their resources tend to remain locked within this predetermined structure. Further complicating matters, applications frequently span project and folder boundaries.
Today, we’re introducing App Hub, which simplifies cloud application deployment and management by providing an accurate representation of deployed applications — one that understands all resource dependencies, regardless of the specific Google Cloud products they use. This view adapts with the reality of the deployment and is maintained for the user, so it is always up to date. With this view, the user always understands the state of deployments and their attributes. In addition, through a deep integration with Gemini Cloud Assist, App Hub helps you to understand deployed applications with deep insights and recommendations on how to resolve incidents.
App Hub is generally available today.
What’s in App Hub
At its core, App Hub provides a uniform model that can ingest different types of infrastructure resources from multiple Google Cloud projects, and abstracts them into standardized logical constructs of Services and Workloads. These Services and Workloads can be grouped into Applications that perform your end-to-end business functionality.
In its current release, App Hub can be used to create applications from Regional Load Balancers and Compute Engine managed instance groups (MIGs), with plans to soon expand to Google Kubernetes Engine (GKE), Cloud Run, Cloud Service Mesh and Google Cloud managed services such as databases or data analytics products. App Hub lets you use APIs, the gcloud command-line interface, a graphical user interface, or Terraform scripts to create applications, define their attributes, and integrate with any existing homegrown or third-party tools that you use for inventorying, incident reporting, ticketing systems, etc.
App Hub can organize your cross-project Google Cloud resources into Applications.
App Hub provides the following capabilities:
Organize and categorize your applications: App Hub lets you organize and categorize applications using attributes such as Owner, Criticality, and Environment. This makes it easy to find and manage specific applications and their associated resources.
Understand resources in your application: App Hub helps you understand the composition of these applications. This can help developers and operators to understand how applications work and their dependencies.
Getting started with App Hub
App Hub operates around the concept of Applications. To get started, follow three simple steps:
Define administrative boundaries - A Host Project is how you define the administrative boundary. All applications in projects associated with a single Host Project are part of the same administrative boundary. This means that the applications in all the projects associated with a Host Project share a common administrator or set of administrators who are allowed to register services and workloads as part of an application, and to update and delete applications. You can separate administration by having multiple Host Projects in the organization. For example, you can create a Host Project for each business unit (e.g. Retail, Online) or one Host Project for each business function (e.g., Finance, Sales). Click here to learn more about how to enable App Hub and create Host Projects.
Discover infrastructure resources - You don't need any additional instrumentation to discover the resources in a group of projects; once a project is added to the Host Project, App Hub automatically discovers its infrastructure resources as Services and Workloads. App Hub can then help you group these Services and Workloads together and organize them into Applications. You can organize Applications manually through the UI, CLI and API, or automatically through Terraform. Click here to learn more about how to register services and workloads into applications.
Define business attributes - To help better organize your Applications, Services and Workloads, you can then define business attributes such as Ownership, Criticality and Environment. Click here to learn more about how to define attributes in App Hub. You can now track, monitor and manage applications instead of the discrete underlying resources.
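For example, a minimal sketch of an application defined with Terraform that captures these attributes; treat the field names as assumptions to verify against the google_apphub_application resource documentation, and the IDs and values as illustrative:

resource "google_apphub_application" "online_store" {
  project        = var.host_project_id   # hypothetical variable for the App Hub host project
  application_id = "online-store"
  location       = "us-central1"
  display_name   = "Online Store"

  scope {
    type = "REGIONAL"
  }

  attributes {
    criticality {
      type = "MISSION_CRITICAL"
    }
    environment {
      type = "PRODUCTION"
    }
  }
}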
Next steps with App Hub
App Hub offers powerful application management today, but there is more to come. Our vision is for App Hub to seamlessly manage your applications against your business goals, minimizing the need to wrestle with complex infrastructure. For example, App Hub will help manage incidents efficiently, apply security controls, check overall health, manage quota adjustments, and holistically manage all the resources in your application across a wide variety of scenarios.
App Hub APIs are now Generally Available (GA) and offered free of charge to all customers. For more information, visit our website. To get started and create your App Hub Applications, please follow the setup instructions.
Everywhere you look, there is an undeniable excitement about AI. We are thrilled, but not surprised, to see Google Cloud’s managed containers taking a pivotal role in this world-shaping transformation. We’re pleased to share announcements from across our container platform that will help you accelerate AI application development and increase AI workload efficiency so you can take full advantage of the promise of AI, while also helping you continue to drive your modernization efforts.
The opportunity: AI and containers
AI visionaries are pushing the boundaries of what's possible, and platform builders are making those visions a scalable reality. The path to success builds on your existing expertise and infrastructure rather than throwing away what you know. These new workloads demand a lot from their platforms in the areas of:
Velocity: with leaders moving rapidly from talking about AI to deploying AI, time to market is more important than ever.
Scale: many of today's systems were designed with specific scalability challenges in mind. Those earlier assumptions have changed significantly, whether you are a large model builder or looking to tune a small model for your specific business needs.
Efficiency goals: AI’s inherent fluidity — such as shifting model sizes and evolving hardware needs — is changing how teams think about the cost and performance of both training and serving. Companies need to plan and measure at granular levels, tracking the cost per token instead of cost per VM. Teams that are able to measure and speak this new language are leading the market.
Containers serve the unique needs of AI
We’ve poured years of our insights and best practices into Google Cloud’s managed container platform and it has risen to the occasion of disruptive technology leaps of the past. And considering the aforementioned needs of AI workloads, the platform’s offerings — Cloud Run and Google Kubernetes Engine (GKE) — are ideally situated to meet the AI opportunity because they can:
Abstract infrastructure away: As infrastructure has changed, from compute to GPU time-sharing to TPUs, containers have allowed teams to take advantage of new capabilities on their existing platforms.
Orchestrate workloads: Much has changed from containers' early days of being used only for running stateless workloads. Today, containers are optimized for a wide variety of workloads, with complexity hidden from both users and platform builders. At Google, we use GKE for our own breakthrough AI products like Vertex AI, and to unlock the next generation of AI innovation with DeepMind.
Support extensibility: Kubernetes’ extensibility has been critical to its success, allowing a rich ecosystem to flourish, supporting user choice and enabling continued innovation. These characteristics now support the rapid pace of innovation and flexibility that users need in the AI era.
Cloud Run and GKE power Google products, as well as a growing roster of leading AI companies including Anthropic, Assembly AI, Cohere, and Salesforce that are choosing our container platform for its reliability, security, and scalability.
Our managed container platform provides three distinct approaches to help you move to implementation:
Solutions to get AI projects running quickly;
The ability to deploy custom AI workloads on GKE; and
Streamlined day-two operations across any of your enterprise deployments.
Cloud Run for an easy AI starting point
Cloud Run has always been a great solution for getting started quickly, offloading operational burden from your platform team and giving developers scalable, easy-to-deploy resources — without sacrificing enterprise-grade security or visibility.
Today, we are pleased to announce Cloud Run application canvas, designed to generate, modify and deploy Cloud Run applications. We've added integrations to services such as Vertex AI, simplifying the process of consuming Vertex AI generative APIs from Cloud Run services in just a few clicks. There are also integrations for Firestore, Memorystore, and Cloud SQL, as well as load balancing. And we've taken the experience one step further and integrated Gemini Cloud Assist, which provides AI assistance to help cloud teams design, operate, and optimize application lifecycles. Gemini in Cloud Run application canvas lets you describe the type of application you want to deploy with natural language, and Cloud Run will create or update those resources in a few minutes.
Cloud Run's application canvas showcasing a gen AI application
The velocity, scale, and efficiency you get from Cloud Run makes it a great option for building AI workloads. To help you get AI applications to market even faster, we’re pleased to announce Cloud Run support for integration with LangChain, a powerful open-source framework for building LLM-based applications. This support makes Cloud Run the easiest way to deploy and scale LangChain apps, with a developer-friendly experience.
“We researched alternatives, and Cloud Run is the easiest and fastest way to get your app running in production." - Nuno Campos, founding engineer, LangChain
Creating and deploying a LangChain application to Cloud Run
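For instance, once a LangChain app has a Dockerfile or buildable source in the current directory, deploying it can be a single command; the service name and region are illustrative:

gcloud run deploy langchain-app --source . --region us-central1 --allow-unauthenticated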
GKE for training and inference
For customers who value an open, portable, cloud-native, and customizable platform for their AI workloads, GKE is ideal. The tremendous growth in AI adoption continues to be reflected in how customers are using our products: over the last year, the use of GPUs and TPUs on Google Kubernetes Engine has grown more than 900%.
To better meet the needs of customers transforming their businesses with AI, we’ve built innovations that let you train and serve the very largest AI workloads, cost effectively and seamlessly. Let’s dive into each of those three: scale, cost efficiency, and ease of use.
Large-scale AI workloads
Many recent AI models demonstrate impressive capabilities, thanks in part to their very large size. As your AI models become larger, you need a platform that's built to handle training and serving massive AI models. We continue to push the limits of accelerator-optimized hardware to make GKE an ideal home for your large-scale AI models:
Cloud TPU v5p, which we announced in December and is now generally available, is our most powerful and scalable TPU accelerator to date. By leveraging TPU v5p on GKE, Google Cloud customer Lightricks has achieved a remarkable 2.5x speedup in training their text-to-image and text-to-video models compared to TPU v4.
A3 Mega, which we announced today, is powered by NVIDIA H100 GPUs and provides 2x more GPU-to-GPU networking bandwidth than A3, accelerating the time to train the largest AI models with GKE. A3 Mega will be generally available in the coming weeks.
Training the largest AI models often requires scaling far beyond a physical TPU. To enable continued scaling, last year we announced multi-slice training on GKE, which is generally available, enabling full-stack, cost-effective, large-scale training with near-linear scaling up to tens of thousands of TPU chips. We demonstrated this capability by training a single AI model using over 50,000 TPU v5e chips while maintaining near-ideal scaling performance.
Cost-efficient AI workloads
As AI models continue to grow, customers face many challenges to scaling in a cost-effective way. For example, AI container images can be massive, causing cold-start times to balloon. Keeping AI inference latency low requires overprovisioning to handle unpredictable load, but slow cold-start times require compensating by overprovisioning even more. All of this creates under-utilization and unnecessary costs.
GKE now supports container and model preloading, which accelerates workload cold start — enabling you to improve GPU utilization and save money while keeping AI inference latency low. When creating a GKE node pool, you can now preload a container image or model data in new nodes to achieve much faster workload deployment, autoscaling, and recovery from disruptions like maintenance events. Vertex AI’s prediction service, which is built on GKE, found container preloading resulted in much faster container startup:
“Within Vertex AI’s prediction service, some of our container images can be quite large. After we enabled GKE container image preloading, our 16GB container images were pulled up to 29x faster in our tests.” – Shawn Ma, Software Engineer, Vertex AI
For AI workloads that have highly variable demand such as low-volume inference or notebooks, a GPU may sit idle much of the time. To help you run more workloads on the same GPU, GKE now supports GPU sharing with NVIDIA Multi-Process Service (MPS). MPS enables concurrent processing on a single GPU, which can improve GPU efficiency for workloads with low GPU resource usage, reducing your costs.
To maximize the cost efficiency of AI accelerators during model training, it’s important to minimize the time an application is waiting to fetch data. To achieve this, GKE supports GCS FUSE read caching, which is now generally available. GCS FUSE read caching uses a local directory as a cache to accelerate repeat reads for small and random I/Os, increasing GPU and TPU utilization by loading your data faster. This reduces the time to train a model and delivers up to 11x more throughput.
Ease of use for AI workloads
With GKE, we believe achieving AI scale and cost efficiency shouldn't be difficult. GKE makes obtaining GPUs for AI training workloads easy by using Dynamic Workload Scheduler, which has been transformative for customers like Two Sigma:
“Dynamic Workload Scheduler improved on-demand GPU obtainability by 80%, accelerating experiment iteration for our researchers. Leveraging the built-in Kueue and GKE integration, we were able to take advantage of new GPU capacity in Dynamic Workload Scheduler quickly and save months of development work.” – Alex Hays, Software Engineer, Two Sigma
For customers who want Kubernetes with a fully managed mode of operation, GKE Autopilot now supports NVIDIA H100 GPUs, TPUs, reservations, and Compute Engine committed use discounts (CUDs).
Traditionally, using a GPU required installing and maintaining the GPU driver on each node. However, GKE can now automatically install and maintain GPU drivers, making GPUs easier to use than ever before.
The enterprise platform for Day Two and beyond
Google Cloud’s managed container platform helps builders get started and scale up AI workloads. But while AI workloads are a strategic priority, there remains critical management and operations work in any enterprise environment. That’s why we continue to launch innovative capabilities that support all modern enterprise workloads.
This starts with embedding AI directly into our cloud. Gemini Cloud Assist helps you boost Day-two operations by:
Optimizing costs: Gemini will help you identify and address dev/test environments left running, forgotten clusters from experiments, and clusters with excess resources.
Troubleshooting: get a natural language interpretation of the logs in Cloud Logging.
Synthetic Monitoring: using natural language, you can now describe the target and user journey flows that you’d like to test, and Gemini will generate a custom test script that you can deploy or configure further based on your needs.
And it's not just Day-two operations: Gemini Cloud Assist can help you deploy three-tier architecture apps, understand Terraform scripts, and more, drastically simplifying design and deployment.
While AI presents a thrilling new frontier, we have not lost focus on the crucial elements of a container platform that serves modern enterprises. We've continued to invest in foundational areas that ensure the stability, security, and compliance of your cloud-native applications, and we're excited to introduce the following preview launches:
GKE threat detection, which identifies common container runtime attacks, analyzes suspicious code, and even uses natural language processing to pinpoint malicious scripts. And this is all integrated with Security Command Center for a comprehensive, cohesive approach to security.
GKE compliance, a fully managed compliance service that automatically delivers end-to-end coverage from the cluster to the container, scanning for compliance against the most important benchmarks. Near-real-time insights are always available in a centralized dashboard, and we produce compliance reports for you automatically. (The accompanying recording shows the GKE security posture dashboard and the details of a detected threat, in this case creation of a pod with privileged containers, then the compliance dashboard with assessments against industry standards, the concerns tab with per-standard reporting, and details of failed compliance constraints such as privilege escalation along with recommended remediation.)
Let’s get to work
The urgency of the AI moment is permeating every aspect of technology, and data scientists, researchers, engineers, and developers are looking to platform builders to put the right resources in their hands. We’re ready to play our part in your success, delivering scalable, efficient, and secure container resources that fit seamlessly into your existing enterprise. We’re giving you three ways to get started:
For building your first AI application with Google Cloud, try Cloud Run and Vertex AI.
To learn how to serve an AI model, get started serving Gemma, Google’s family of lightweight open models, using Hugging Face TGI, vLLM, or TensorRT-LLM.
If you’re ready to try GKE AI with a Retrieval Augmented Generation (RAG) pattern with an open model or AI ecosystem integrations such as Ray, try GKE Quick Start Solutions.
Google Cloud has been steadfast in its commitment to being the best place to run containerized workloads since the 2015 launch of Google Container Engine. 2024 marks a milestone for open source Kubernetes, which celebrates its 10th anniversary in June. We’d like to give kudos to the community that has powered its immense success. According to The Cloud Native Computing Foundation (CNCF), the project now boasts over 314,000 code commits, by over 74,000 contributors. The number of organizations contributing code has also grown from one to over 7,800 in the last 10 years. These contributions, as well as the enterprise scale, operational capability, and accessibility offered by Google Cloud’s managed container platform, have constantly expanded the usefulness of containers and Kubernetes for large numbers of organizations. We’re excited to work with you as you build for the next decade of AI and beyond!
Google Cloud Next ‘24 is coming to Las Vegas, and this year’s offerings are packed with exciting experiences for cloud engineers of every type. In particular, cloud architects have a lot to choose from — Spotlights, Showcases, Breakouts, and the Innovator’s Hive. Hot topics include networks, storage, distributed cloud and, of course, AI. You can view the entire session library here, but below are a few sessions you should be sure to check out:
Spotlight SPTL204 AI and modernization on your terms: from edge to sovereign to cross-cloud: Join VP/GM of Infrastructure Sachin Gupta to learn about leveraging AI in public cloud, edge, and sovereign cloud use cases.
Spotlight SPTL205 Modern cloud computing: workload-optimized and AI-powered infrastructure: Here, VP/GM Compute ML Mark Lohmeyer shows how to build, scale modern AI workloads and use AI to optimize existing infrastructure.
Showcase Demo INFRA-104 Simplify and secure cloud networks with Cross-Cloud Network: Come to this Infrastructure Showcase to see how Cross-Cloud Networking can simplify your architecture.
Breakout ARC215 AI anywhere: How to build Generative AI Applications On-Premises with Google Distributed Cloud: Join this breakout to learn how to deploy and operate generative AI applications on-premises.
Innovator’s Hive Lightning Talk IHLT103 7 tips and tools to choose the Right Cloud Regions for your AI Workloads: This Lightning Talk will give you tips on how to deploy your AI workloads right the first time.
Breakout ARC218 Accelerate AI inference workloads with Google Cloud TPUs and GPUs: In this breakout session, learn the finer points of scaling your inference workloads using TPUs and GPUs.
Breakout ARC108 Increase AI productivity with Google’s AI Hypercomputer: Learn the latest about Google’s new AI Hypercomputer.
Breakout ARC208 What’s new in cloud networking: All the latest from Cloud Networking with VP/GM Muninder Sambi and team.
Breakout ARC204 How to optimize block storage for any workload with the latest from Hyperdisk: Learn how to accelerate your AI inference workloads with optimized block storage using Hyperdisk.
But these are just a handful of the amazing sessions we’re offering this year — be sure to add them to your personal agenda. And if you’re a networking person, be sure to check out my networking session recommendations. See you at the show!
As IT admins or architects, you have your work cut out for you. You need infrastructure that’s fast, secure, cost-effective, and ready for everything from AI to analytics to large enterprise apps. And that’s just your day job. You also face a constantly evolving IT environment where emerging technologies like generative AI force you to adapt and learn to stay ahead of the curve.
If this sounds familiar, Google Cloud Next ’24 is exactly what you need to learn about the latest cloud secrets, systems and silicon.
“Really?” you ask in a discernibly weary, sarcastic tone.
Really. This year the event is in Las Vegas on April 9-11, with a broad agenda including keynotes, ‘how-to’ deep-dives, panels, demos and labs. You can view the entire session library here, but if you’re still on the fence, let’s power through six questions that might be on your mind right now, and how you can get answers to them at Next ‘24.
1. How can I reduce costs and evaluate the reliability of cloud providers?
Moving and grooving on the cloud doesn’t need to be a headache. And not all providers are the same. You can save big with easy-to-use strategies, tools and flexible infrastructure. Here are a few sessions to explore:
The reality of reliability: Big differences between hyperscalers, explained - Get an objective assessment of outage/incident data from three major cloud providers, then learn how you can improve your operational reliability.
Apps take flight: Migrating to Google Cloud - Learn modern migration strategies and how Google simplifies the transition from on-prem or other clouds.
Optimize costs and efficiency with new compute operations solutions - Discover features and automations that streamline workload management; hear the latest product announcements and roadmap.
2. What are the opportunities, risks, and best practices when modernizing enterprise apps like VMware and SAP?
Cloud-native workloads are great and all, but enterprise workloads are the lifeblood of most organizations. Deciding if, when, and how to modernize them is challenging. At Next ‘24, we will share a ton of ideas and help you assess the trade-offs for yourself:
Transform your VMware estate in the cloud without re-factoring or re-skilling - Explore how Google Cloud VMware Engine makes modernizing existing VMware setups fast and smooth.
Storage solutions optimized for modern and enterprise workloads - Find the perfect cloud-based file storage to balance your workload’s performance, availability, and cost needs.
Transform your SAP workload with Google Cloud - Optimize SAP with Google Cloud’s reliable infrastructure, tools, and best practices.
3. How should I architect my infrastructure for AI?
Tackling AI projects requires performance, scalability and know-how. Google Cloud’s AI Hypercomputer, the culmination of a decade of pioneering AI infrastructure advancements, helps businesses with performance-optimized hardware, open software, leading ML frameworks, and flexible consumption models. In fact, we were just named a Leader in The Forrester Wave™: AI Infrastructure Solutions, Q1 2024, with the highest scores of any vendor evaluated in both the Current Offering and Strategy categories.
Here are a few sessions where you can learn about our AI infrastructure during Next ‘24:
Workload-optimized and AI-powered infrastructure - Hear the latest product announcements from VP/GM Compute & ML, Mark Lohmeyer.
Increase AI productivity with Google Cloud’s AI Hypercomputer - A 101 on our AI supercomputing architecture.
How to get easy and affordable access to GPUs for AI/ML workloads - Get tips on making the most of GPU access for your AI/ML projects.
4. What’s the best way to build and run container-based apps?
Using managed services and choosing the right database will help you build secure, scalable container-based applications. Google Cloud’s container offerings are developer favorites, packaging more than a decade’s worth of experience launching several billion containers per week. Here are a few sessions you should check out during Next ‘24:
The ultimate hybrid example: A fireside chat about how Google Cloud powers (part of) Alphabet - See how Google itself uses GKE, Cloud Run, and more to power its own services!
A primer on data on Kubernetes - Explore the rise of Kubernetes for data management, including AI/ML and storage best practices.
5. Can you meet my sovereignty, scalability and control requirements?
Many customers face challenges in adopting public cloud services due to policies and regulations that affect connectivity, reliability, data volumes, and security. Google Distributed Cloud offers solutions for government, regulated industries, telecom, manufacturing or retail (check out McDonald’s recent experience). Here are a few sessions where you can learn more:
Google Distributed Cloud’s 2024 Roadmap - Learn the basics and get a summary of new features, plus our roadmap.
How to build on-premises compliance controls for AI in regulated industries - Explore solutions to meet your toughest data sovereignty, security, and compliance requirements.
Deliver modern AI-enabled retail experiences for one or thousands of locations - Learn how to simplify configuration management at the edge, enabling store analytics, fast check-out, predictive analytics, and more.
6. I just want to see something cool. What’s the one session I should attend?
You probably want to know how we’re embedding generative AI into Google Cloud solutions. We can’t reveal much yet, but make sure you add this session to your schedule: Transform your cloud operations and design capability with Duet AI for Google Cloud
Any of those questions ring a bell? We’ve got lots of answers, and we’re ready to share them at Next ‘24 starting on April 9th.
Not registered yet? Don’t delay! Space is limited.