Nail your next Google Cloud Platform interview with our extensive collection of scenario-based questions on Compute Engine, GKE, Cloud Storage, BigQuery, VPC, and architecture.
Explore the GCP track and master your concepts with 50 hand-picked questions.
Google Compute Engine is the Infrastructure as a Service (IaaS) component of Google Cloud. It allows users to launch virtual machines on demand, drawing on Google's massive infrastructure, providing high-performance, scalable compute resources.
Google Cloud Storage is an enterprise-ready, fully managed RESTful object storage service. It is designed for secure and durable storage of unstructured data like images, backups, and static website content.
Primitive roles (now called basic roles: Owner, Editor, Viewer) grant broad permissions across an entire project. Predefined roles are far more granular: they are defined by Google and grant specific permissions for specific services (e.g., 'Storage Object Viewer'), adhering better to the principle of least privilege.
Google App Engine is a fully managed, serverless Platform as a Service (PaaS) offering on GCP. Developers can deploy code in supported languages, and App Engine automatically handles infrastructure provisioning, load balancing, and scaling.
Unlike AWS and Azure, where virtual networks are strictly regional, a GCP Virtual Private Cloud (VPC) is a global resource. A single VPC can span multiple regions worldwide (with regional subnets), allowing resources in different parts of the globe to communicate privately on the same network.
Cloud SQL is a fully managed relational database service for MySQL, PostgreSQL, and SQL Server, best for regional workloads. Cloud Spanner is a fully managed relational database with massive horizontal scale, strong global consistency, and high availability, designed for massive, global, mission-critical applications.
Preemptible VMs are highly affordable, short-lived compute instances suitable for batch jobs and fault-tolerant workloads. They are offered at a steep discount (60-91% less than regular instances), but they run for at most 24 hours and Google can terminate (preempt) them at any time when it needs the capacity back.
Google Kubernetes Engine is a managed, production-ready environment for deploying containerized applications. It brings the power of Kubernetes orchestration, integrating deeply with GCP's load balancing, networking, and security features.
BigQuery is a fully managed, serverless, and highly scalable enterprise data warehouse. It enables super-fast SQL queries against terabytes or petabytes of data using the processing power of Google's infrastructure, without needing a database administrator.
Cloud Pub/Sub is a fully managed real-time messaging service that allows you to send and receive messages between independent applications. It provides asynchronous, many-to-many communication, decoupling senders and receivers to build highly scalable event-driven systems.
To ensure high availability within a single region, I would deploy the application's resources across multiple Zones. A zone constitutes an independent failure domain within a region. By placing Compute Engine instances or GKE clusters across at least two or three zones in the same region and using a regional load balancer, the application remains available even if an entire zone experiences an outage.
For a simple, stateless web container, Cloud Run is the most straightforward and fully managed option. It abstracts away all infrastructure management, automatically scales from zero to handle traffic spikes, and charges only for the exact resources the container uses while processing requests.
The 'Standard' storage class is the most appropriate. It is designed for frequently accessed data, provides low latency, and is the best fit for serving website content, streaming videos, or interactive use cases where data is accessed immediately and often.
I would assign the developer a 'Viewer' role (either at the project level or for specific services). In GCP IAM, the Viewer role grants read-only access to view resources and their configurations without the ability to create, modify, or delete them.
I would configure Cloud NAT (Network Address Translation). Cloud NAT enables instances in private subnets (without external IP addresses) to access the internet for updates or external API calls while remaining inaccessible to inbound traffic originating from the internet.
Cloud SQL is the best fit. It is a fully managed relational database service for MySQL, PostgreSQL, and SQL Server that automatically handles tedious administrative tasks like backups, replication, patch management, and capacity management.
I would use GCP Billing Budgets and Alerts. I can set a specific budget amount for the project or billing account and configure threshold rules (e.g., 50%, 90%, 100%) to send email notifications or trigger Pub/Sub messages when my spending approaches or exceeds those limits.
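The threshold rules above are just a ratio check against the configured budget. Here is a minimal sketch of the logic a budget-notification handler (e.g., one subscribed to the budget's Pub/Sub topic) might apply; the function name, threshold values, and amounts are illustrative, not a GCP API:

```python
# Thresholds mirror the rules configured on the budget (50%, 90%, 100%).
THRESHOLDS = [0.5, 0.9, 1.0]

def crossed_thresholds(budget: float, spend: float) -> list[float]:
    """Return every configured threshold that current spend has reached."""
    ratio = spend / budget
    return [t for t in THRESHOLDS if ratio >= t]

# With a $1000 budget and $920 spent, the 50% and 90% rules have fired.
print(crossed_thresholds(budget=1000.0, spend=920.0))  # [0.5, 0.9]
```

In practice the budget service evaluates these rules for you; custom logic like this is only needed when reacting programmatically to the Pub/Sub notifications.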
Google Cloud Deployment Manager (or increasingly, Terraform via Google Provider) is the right tool. Deployment Manager allows you to specify all the resources needed for your application in a declarative format (YAML or Python) and deploy them together as a unified deployment.
App Engine (Standard Environment) is designed for this. It is a fully managed Platform-as-a-Service (PaaS) where you just upload your source code, and App Engine automatically handles the provisioning, deployment, load balancing, and scaling of instances based on traffic.
BigQuery is the right product. It is a serverless, highly scalable, and cost-effective multi-cloud data warehouse designed for business agility. It allows you to run fast SQL queries on massive datasets without needing to manage any underlying infrastructure.
I would use Cloud Run. Because the application needs to scale to zero (saving costs during idle times) and scale rapidly to handle thousands of requests, Cloud Run's serverless container model is ideal. I would ensure the container image is small and the application logic is optimized for fast cold starts.
I would use the Global External Application Load Balancer (formerly HTTP(S) Load Balancing). It provides a single global Anycast IP address and routes traffic to the backend service closest to the user, ensuring the lowest possible latency and balancing load across regions.
I would use a Service Account. I would create a dedicated Service Account with the 'Storage Object Viewer' role on the specific bucket. Then, I would attach this Service Account to the Compute Engine VM. The application can then use the built-in metadata server to securely retrieve short-lived access tokens to access the bucket.
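For illustration, this is roughly what "using the metadata server" looks like from inside the VM. The endpoint and mandatory `Metadata-Flavor` header are the documented interface; the sketch only builds the request, since it can succeed only when run on a Compute Engine instance (normally a client library does this for you):

```python
import urllib.request

# Metadata server endpoint for the attached service account's access token.
# Resolvable only from inside a Compute Engine VM.
TOKEN_URL = ("http://metadata.google.internal/computeMetadata/v1/"
             "instance/service-accounts/default/token")

def build_token_request() -> urllib.request.Request:
    # The Metadata-Flavor: Google header is required; requests without it
    # are rejected to prevent accidental or SSRF-style access.
    return urllib.request.Request(TOKEN_URL,
                                  headers={"Metadata-Flavor": "Google"})

req = build_token_request()
print(req.full_url)
```

Calling `urllib.request.urlopen(req)` on the VM returns a JSON body containing a short-lived `access_token`, so no key file ever needs to be stored on the instance.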
Firestore is the best choice. It is a fully managed, scalable, serverless document database well-suited for mobile, web, and server development. It offers strong consistency globally and provides built-in offline synchronization and real-time updates for client applications.
This should be implemented using VPC Network Peering. VPC Peering allows you to connect two VPC networks privately. Traffic between the peered networks stays entirely within Google's internal network, maintaining high throughput and low latency without going over the public internet.
Transitioning to Terraform enables version control, repeatability, and automated audits of the infrastructure. Terraform's state management keeps track of the resources it manages. This allows Terraform to calculate differences between the desired configuration in code and the actual state in GCP, applying only the necessary changes and preventing accidental drift.
I would use Cloud Logging, which automatically ingests logs from GKE. I would create a log sink or use log queries to filter for specific error severities. Then I would create a log-based metric from this filter and, finally, set up an alerting policy in Cloud Monitoring to notify the team (via email, Slack, etc.) if that metric exceeds the defined threshold.
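Conceptually, the log-based metric plus alerting policy boils down to counting matching entries in a window and comparing against a threshold. A toy simulation (severity names mirror Cloud Logging's, but the pipeline itself is illustrative):

```python
from collections import Counter

ERROR_THRESHOLD = 3  # illustrative alerting-policy threshold per window

def should_alert(entries: list[dict]) -> bool:
    """Count ERROR-severity entries in the window; fire at the threshold."""
    severities = Counter(e["severity"] for e in entries)
    return severities["ERROR"] >= ERROR_THRESHOLD

window = [{"severity": "INFO"}, {"severity": "ERROR"},
          {"severity": "ERROR"}, {"severity": "ERROR"}]
print(should_alert(window))  # True
```

The real system does this continuously: the filter feeds the metric, and the alerting policy evaluates the metric over its alignment window.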
I would use Spot VMs (formerly Preemptible VMs). Spot VMs are excess Compute Engine capacity offered at a heavily discounted price (60-91% off). They can be reclaimed by Google at any time, but since the batch workload is fault-tolerant and restartable, they are a perfect fit for cost reduction.
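"Fault-tolerant and restartable" in practice means checkpointing progress durably so a replacement VM resumes where a preempted one stopped. A minimal sketch, where a local JSON file stands in for durable storage such as a Cloud Storage object, and `_process` is a placeholder for the real unit of work:

```python
import json
import os
import tempfile

def _load_checkpoint(path: str) -> int:
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return json.load(f)["next_index"]

def _save_checkpoint(path: str, next_index: int) -> None:
    with open(path, "w") as f:
        json.dump({"next_index": next_index}, f)

def _process(item) -> None:
    pass  # placeholder: must be idempotent, since preemption can repeat work

def run_batch(items: list, checkpoint_path: str) -> int:
    """Process items from the last checkpoint; return units done this run."""
    start = _load_checkpoint(checkpoint_path)
    for i in range(start, len(items)):
        _process(items[i])
        _save_checkpoint(checkpoint_path, i + 1)
    return len(items) - start

path = os.path.join(tempfile.mkdtemp(), "ckpt.json")
_save_checkpoint(path, 7)                 # pretend a prior VM reached item 7
print(run_batch(list(range(10)), path))   # 3
```

On Compute Engine, the preemption notice (a shutdown signal with a short grace period) is the natural place to flush a final checkpoint.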
The most decoupled approach is to use Cloud Functions (or Cloud Run) triggered by Eventarc. Specifically, I would configure the Cloud Storage bucket to emit an event on object creation, which triggers the Cloud Function to run the Python script, process the image, and store the thumbnail in a destination bucket.
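As a sketch, the triggered function receives an event naming the bucket and object, derives a thumbnail name, and writes to the destination bucket. The `bucket`/`name` keys follow the Cloud Storage notification payload; the resize step is stubbed and the destination bucket name is hypothetical:

```python
import os

THUMB_BUCKET = "my-thumbnails-bucket"  # assumed destination bucket

def on_object_finalized(event: dict) -> tuple[str, str]:
    """Handle an object-finalized event; return (dest bucket, thumb name)."""
    src_bucket, name = event["bucket"], event["name"]
    base, ext = os.path.splitext(name)
    thumb_name = f"{base}_thumb{ext}"
    # Real code would download from src_bucket, resize the image
    # (e.g., with Pillow), and upload thumb_name to THUMB_BUCKET here.
    return THUMB_BUCKET, thumb_name

print(on_object_finalized({"bucket": "uploads", "name": "cat.png"}))
# ('my-thumbnails-bucket', 'cat_thumb.png')
```

Writing thumbnails to a separate bucket also avoids the classic pitfall of the function re-triggering itself on its own output.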
Memorystore (for Redis or Memcached) is the appropriate service. It provides a fully managed in-memory data store service built on scalable, secure, and highly available infrastructure, ideal for caching frequently accessed database queries to improve application latency and reduce backend load.
I would enable the Cluster Autoscaler on the GKE cluster. The Cluster Autoscaler continuously monitors the cluster for pods that cannot be scheduled due to resource limitations and automatically adds nodes to the node pool. Conversely, it scales down nodes if they are underutilized for a sustained period and their pods can be accommodated on other nodes.
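The scale-up decision is, at its core, capacity arithmetic: pending pods request resources, and enough nodes must be added to fit them. A back-of-the-envelope sketch (the real Cluster Autoscaler simulates the scheduler and accounts for memory, taints, and per-node daemons, which this deliberately ignores):

```python
import math

def extra_nodes_needed(pending_pod_cpus: list[float],
                       node_cpu: float) -> int:
    """Nodes of a fixed CPU size needed to absorb pending pods' requests."""
    demand = sum(pending_pod_cpus)
    return math.ceil(demand / node_cpu)

# Three unschedulable pods requesting 0.5, 1.0, and 2.5 vCPU, on 2-vCPU nodes:
print(extra_nodes_needed([0.5, 1.0, 2.5], node_cpu=2.0))  # 2
```

Scale-down runs the inverse check: a node is removed only when its pods can be rescheduled elsewhere and utilization has stayed low for a sustained period.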
The solution is Cloud Interconnect (specifically Dedicated Interconnect). This provides a direct physical connection between the on-premises network and Google's network via a colocation facility. It guarantees bandwidth, lower latency than VPN, and traffic does not traverse the public internet, meeting strict enterprise policies.
This is enforced centrally using Organization Policies. I would define a policy constraint (specifically, 'Resource Location Restriction') at the Organization or Folder level. By specifying the allowed regions in the policy, it acts as a guardrail, automatically denying any resource creation requests outside the permitted locations across all inherited projects.
The architecture would utilize Cloud Pub/Sub to decouple ingestion, acting as a highly scalable message queue to absorb the IoT data stream. Cloud Dataflow would subscribe to Pub/Sub to perform real-time stream processing, parsing, cleaning, and transforming the data. Dataflow would then stream the processed data directly into BigQuery tables for real-time analytics.
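The heart of the Dataflow stage is a parse/clean transform per message. A minimal stand-in, decoding a Pub/Sub payload, dropping malformed readings, and emitting rows shaped for a BigQuery table (the field names `device` and `temp` are hypothetical):

```python
import json

def to_bq_row(payload: bytes):
    """Decode one IoT message; return a BigQuery-shaped row, or None."""
    try:
        msg = json.loads(payload)
        return {"device_id": msg["device"], "temp_c": float(msg["temp"])}
    except (ValueError, KeyError):
        return None  # malformed messages are filtered out (or dead-lettered)

payloads = [b'{"device": "d1", "temp": "21.5"}', b'not json']
rows = [r for p in payloads if (r := to_bq_row(p))]
print(rows)  # [{'device_id': 'd1', 'temp_c': 21.5}]
```

In the real pipeline this logic would live in a Beam `DoFn`, with Dataflow handling parallelism, windowing, and the streaming insert into BigQuery.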
I would implement VPC Service Controls (VPC SC). By wrapping the project in a VPC SC perimeter, I define a secure boundary around GCP APIs. Even if a user has valid IAM credentials, access to Cloud Storage or BigQuery is denied unless the request originates from within the perimeter or an explicitly defined trusted ingress rule (like an Access Level tied to corporate IP ranges).
Cloud SQL is insufficient because it scales writes only vertically within a single region, and traditional synchronous replication across continents degrades performance unacceptably. The right choice is Cloud Spanner. Spanner provides strict serializability (strong consistency) globally, scales writes horizontally across regions transparently, and offers up to five nines of availability; it is designed specifically for this tier of global relational workload.
This is a Canary Deployment. I would deploy the new revision to Cloud Run but configure it to serve zero traffic initially. Then, using Cloud Run's built-in traffic management, I would split traffic, allocating 90% to the stable revision and 10% to the new revision. After monitoring metrics in Cloud Logging/Monitoring, I would gradually increase the traffic to the new revision until it handles 100%.
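The rollout itself is just a staged schedule with a rollback rule. A sketch of that control logic, with illustrative stage percentages (the actual splits are applied via Cloud Run's traffic settings):

```python
# Traffic share (%) given to the new revision at each healthy step.
STAGES = [0, 10, 25, 50, 100]

def next_share(current: int, healthy: bool) -> int:
    """Advance one stage when metrics look healthy; roll back to 0 if not."""
    if not healthy:
        return 0  # all traffic returns to the stable revision
    i = STAGES.index(current)
    return STAGES[min(i + 1, len(STAGES) - 1)]

print(next_share(10, healthy=True))   # 25
print(next_share(50, healthy=False))  # 0
```

Each advance would be gated on error-rate and latency metrics from Cloud Monitoring rather than a fixed timer.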
Shared VPC is the required architecture. The Network Security team operates a 'Host Project' where they define the central VPC network, subnets, and firewall rules. The dev teams operate 'Service Projects', which are attached to the Host Project. Devs can deploy resources (like VMs or GKE) into the Service Projects, but those resources reside on the subnets managed centrally in the Host Project.
This is achieved using Identity-Aware Proxy (IAP). IAP sits in front of the application (attached to the Load Balancer) and verifies user identity (via Google Workspace/Cloud Identity) and context (device posture, IP) before granting access. It enforces Zero Trust access, ensuring only authenticated and authorized users reach the application without needing a complex VPN setup.
GKE Enterprise (formerly Anthos) addresses this. It provides a consistent management platform for multi-cloud and hybrid Kubernetes environments. Using components like Anthos Config Management (for declarative policy enforcement) and Anthos Service Mesh (for secure communication and observability), it centralizes governance and simplifies operations across the entire mixed fleet.
I would leverage Migrate to Virtual Machines (formerly Migrate for Compute Engine). The strategy involves continuous replication of on-prem disks to GCP. Phase 1 is a 'test clone': validating the VM in GCP iteratively without impacting production. Phase 2 is the 'cutover', a final sync with minimal downtime. Phase 3 is thorough validation. If issues occur, the continuous replication model allows falling back to the on-prem source until sign-off.
For GKE, I would run Active-Passive or Active-Active clusters across two regions (e.g., us-central1, us-east4), routing traffic via a Global External Load Balancer connected to Multi-Cluster Ingress. For Cloud Storage, I would use Dual-Region buckets for transparent multi-region redundancy. For Cloud SQL (assuming PostgreSQL), I would configure Cross-Region Read Replicas. During a disaster, I'd promote the cross-region replica to primary to meet the RPO/RTO demands.
The foundation requires a strict Resource Hierarchy (Folders representing business units) mapped to isolated Billing Accounts. I would enforce Organization Policies heavily (e.g., denying external IPs, restricting regions, mandating labels). For FinOps, I would configure Billing Data Export to BigQuery to build granular Looker dashboards analyzing spend by the mandated labels. Lastly, I'd utilize Active Assist recommendations and custom scripts querying the Cloud Asset Inventory to identify and automate the cleanup of idle resources.
This is orchestrated centrally using Dataplex. Each domain maintains its data in specific BigQuery datasets (zones). Dataplex manages data product sharing using Authorized Views or Analytics Hub to prevent physical duplication. Central IT enforces compliance by applying Data Catalog tags identifying PII across the mesh, and linking those tags to IAM constraints to implement row/column-level security consistently, regardless of which domain owns the underlying table.
Data-plane optimization requires deep observability. First, I would ensure trace sampling is tuned (Cloud Trace) to capture outliers. Next, I would examine the Envoy proxy metrics aggregated centrally in Cloud Monitoring to identify bottlenecks (CPU starvation on sidecars, connection-pool limits). Optimization involves tuning Envoy concurrency, implementing aggressive sidecar scoping (the Sidecar resource) so proxies only hold configuration for the services they actually call rather than the entire mesh, and potentially evaluating eBPF-based data planes if proxy overhead is prohibitive.
For outgoing API access: Configure the VM in a private subnet, utilizing Cloud NAT. Restrict outbound traffic via VPC Firewall Rules to only the specific API IPs. For developer SSH: Implement OS Login coupled with Identity-Aware Proxy (IAP) TCP forwarding. Developers authenticate via Google identity. IAP establishes a secure tunnel to the VM's SSH port without public IPs. All authentication and authorization events are centrally logged in Cloud Audit Logs, adhering strictly to zero trust.
Private Service Connect (PSC) is the most robust topology here. The central core microservices (in the hub project) are exposed as a producer service behind a PSC service attachment. The individual tenant VPCs act as consumers: each connects to the central service through a PSC endpoint with an internal IP address in its own VPC. This provides strong isolation, scales to many tenants, and entirely sidesteps IP address overlap between tenant networks.
The pipeline utilizes Vertex AI Pipelines. Code changes trigger Cloud Build, which containerizes the training code and pushes to Artifact Registry. The pipeline retrains the model on fresh BigQuery data and validates accuracy. If acceptable, it registers the model in Vertex AI Model Registry. Deployment is orchestrated to a Vertex AI Endpoint. For A/B testing, the Endpoint's traffic splitting feature routes a percentage of traffic to the new model version. Vertex AI Model Monitoring is configured continuously to detect feature skew or concept drift, automatically triggering retraining when necessary.
For ordering, I enable Pub/Sub message ordering with ordering keys, ensuring sequential processing per key by the subscriber. To handle poison messages, I configure a dead-letter topic with a bounded retry policy so failing messages are offloaded without blocking the queue. For exactly-once delivery (which Pub/Sub now supports natively), I enable the feature on the subscription; crucially, sound system design still dictates that the consumer logic (e.g., Cloud Run or Dataflow) remains idempotent, using database constraints (such as unique transaction IDs in Spanner) to silently drop any duplicate deliveries.
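A toy illustration of the idempotent-consumer half of that design: messages arrive tagged with an ordering key, and a seen-ID set (standing in for a database uniqueness constraint) drops redeliveries. The message shape is illustrative, not the Pub/Sub client API:

```python
class Consumer:
    def __init__(self) -> None:
        self.seen: set[str] = set()              # stands in for a DB constraint
        self.applied: list[tuple[str, str]] = [] # effects actually applied

    def handle(self, msg: dict) -> bool:
        """Apply a message once; silently drop duplicate deliveries."""
        if msg["id"] in self.seen:
            return False
        self.seen.add(msg["id"])
        self.applied.append((msg["ordering_key"], msg["data"]))
        return True

c = Consumer()
for m in [{"id": "1", "ordering_key": "user-42", "data": "create"},
          {"id": "2", "ordering_key": "user-42", "data": "update"},
          {"id": "1", "ordering_key": "user-42", "data": "create"}]:  # redelivery
    c.handle(m)
print(c.applied)  # [('user-42', 'create'), ('user-42', 'update')]
```

In production the `seen` set must be durable and transactional with the side effect itself, which is exactly what a unique-key insert in the database gives you.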
Hotspots in Spanner during bulk inserts are generally caused by monotonically increasing primary keys (like timestamps or sequential IDs), causing all writes to hit a single split (server node). To re-architect, I would employ key hashing or salting. I would prefix the primary key with a hash of a business-relevant string (like a user ID) or calculate a shard ID to uniformly distribute writes across all available splits, maximizing Spanner's horizontal write capacity and eliminating the serialization bottleneck.
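A minimal sketch of the salting scheme, assuming a composite primary key of (ShardId, UserId, Timestamp); the shard count is illustrative and would be tuned to the instance's node count:

```python
import hashlib

NUM_SHARDS = 16  # illustrative; pick relative to Spanner node count

def salted_key(user_id: str, timestamp_micros: int) -> tuple[int, str, int]:
    """Prefix a hash-derived shard ID so sequential timestamps spread
    across splits instead of all landing on the rightmost one."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    shard = digest[0] % NUM_SHARDS  # stable: same user always maps here
    return (shard, user_id, timestamp_micros)

print(salted_key("user-42", 1700000000000000))
```

Reads for a single user remain cheap (one shard value is computable from the user ID), while range scans across all users fan out over NUM_SHARDS key prefixes.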