[Part 2] Building a Core Web Vitals Report with GA4, BigQuery, and Data Studio

茄子妹妹

Published: 2022-08-10 02:05:28

Updated: 2022-08-10 02:27:09

Readers may wonder: since GA4 can be connected to Data Studio directly, why bother linking GA4 to BigQuery, building the final report there, and only then connecting it to Data Studio? Isn't that redundant? Let me explain:

As I explained in detail in [Part 1] Building a Core Web Vitals Report with GA4, BigQuery, and Data Studio: when GA4 is connected to Data Studio directly, the data coming from GA4 is aggregated rather than user-level, and report data in Data Studio is often unstable. Linking through BigQuery solves both problems: you get a detailed, user-level base table and stable reports. The two approaches differ as follows:

|  | GA4 linked directly to Data Studio | GA4 → BigQuery → Data Studio |
| --- | --- | --- |
| User-level data | Only aggregated data is available; a user-level base table cannot be imported into Data Studio | A base table built in BigQuery can be linked to Data Studio, so analysis can be done on user-level data |
| Report stability | Custom events reported to GA4 sometimes fail to map onto Data Studio fields, and reports often become unstable after switching data sources | BigQuery columns map one-to-one onto Data Studio fields, and reports stay stable |
| Pageview grouping | Data Studio does not group Web Vitals events by pageview | With BigQuery we can group by pageview ahead of time, however we like |
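The pageview grouping mentioned above can be sketched in Python. Web Vitals events such as CLS and LCP may report several times during a single pageview, all sharing one `metric_id`, so the per-pageview value is the maximum seen for that id — the same idea the SQL later in this post implements with `GROUP BY` plus `MAX(metric_value)`. All names and values here are illustrative, not from the GA4 export itself:

```python
from collections import defaultdict

# Hypothetical raw web-vitals events: each pageview's metric reports
# under one metric_id, possibly several times (CLS and LCP can
# re-report as the page keeps changing).
events = [
    {"metric_id": "v1-a", "metric_name": "LCP", "metric_value": 1800},
    {"metric_id": "v1-a", "metric_name": "LCP", "metric_value": 2100},  # later report
    {"metric_id": "v1-b", "metric_name": "CLS", "metric_value": 0.05},
    {"metric_id": "v1-b", "metric_name": "CLS", "metric_value": 0.12},
]

# Keep one final value per (metric_id, metric_name), taking the max,
# mirroring the SQL's GROUP BY + MAX(metric_value).
per_pageview = defaultdict(float)
for e in events:
    key = (e["metric_id"], e["metric_name"])
    per_pageview[key] = max(per_pageview[key], e["metric_value"])

print(dict(per_pageview))
```

After this step each pageview contributes exactly one row per metric, which is what makes percentile analysis meaningful.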

To sum up, what we want is the Web Vitals data corresponding to a single user's single pageview, as shown below.

Base table for the 75th-percentile data
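Once each pageview contributes one row per metric, the 75th percentile shown above is just a quantile over those rows. A minimal sketch (the sample LCP values in seconds are made up):

```python
import statistics

# Hypothetical per-pageview LCP values in seconds, one row per pageview.
lcp_values = [1.2, 1.9, 2.1, 2.4, 2.8, 3.5, 4.1, 1.6]

# quantiles(n=4) returns the 25th/50th/75th percentiles; take the last.
p75 = statistics.quantiles(lcp_values, n=4)[-1]
print(round(p75, 2))
```

This is the same p75 convention Google uses when judging whether a site passes a Core Web Vitals threshold.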

1. Create a Google Cloud account and project

To link GA4 with BigQuery and produce the report, you first need to create a Google Cloud account and create a project.

2. Link GA4 to BigQuery

Next, go to GA4 Admin and, under the property, select BigQuery Linking. Clicking it opens a Google Cloud dialog, where you select the Google Cloud project you created. Remember to check Daily export in the configuration; once checked, the data exported from GA4 to BigQuery will refresh on a daily schedule.

Linking GA4 to BigQuery; check Daily during linking so the data refreshes on schedule

3. Confirm the data has arrived in BigQuery

After completing the GA4-BigQuery link configuration, you will need to wait roughly 24 hours before the linked tables appear under your project on Google Cloud Platform. A successfully linked table looks something like this:

4. Run a script in BigQuery to produce the data format we need

With the BigQuery tables now generated, we next need to transform the data into a format Data Studio can consume; ultimately we want to see, in Data Studio, the Web Vitals score for each pageview.

We use SQL to do the grouping: run the script to generate the report table. On the events table, click Query and choose In new tab. Copy and paste the SQL below into the query box, remembering to replace `your-project.analytics_123456789` with your own project and dataset names.

#  Replace target table name
CREATE OR REPLACE TABLE `your-project.analytics_123456789.web_vitals_summary`
  PARTITION BY DATE(event_timestamp)
  CLUSTER BY metric_name
AS
SELECT
  ga_session_id,
  IF(
    EXISTS(SELECT 1 FROM UNNEST(events) AS e WHERE e.event_name = 'first_visit'),
    'New user',
    'Returning user') AS user_type,
  IF(
    (SELECT MAX(session_engaged) FROM UNNEST(events)) > 0, 'Engaged', 'Not engaged')
    AS session_engagement,
  evt.* EXCEPT (session_engaged, event_name),
  event_name AS metric_name,
  FORMAT_TIMESTAMP('%Y%m%d', event_timestamp) AS event_date,
  
  # additions 1 START
  CASE event_name
    WHEN 'CLS' THEN 
        CASE 
            WHEN metric_value <= 0.1 THEN 'Good'
            WHEN metric_value <= 0.25 THEN 'Needs Improvement'
            ELSE 'Poor'
        END
    WHEN 'LCP' THEN 
        CASE 
            WHEN metric_value <= 2.500 THEN 'Good'
            WHEN metric_value <= 4.000 THEN 'Needs Improvement'
            ELSE 'Poor'
        END
    WHEN 'FID' THEN 
        CASE 
            WHEN metric_value <= 100 THEN 'Good'
            WHEN metric_value <= 300 THEN 'Needs Improvement'
            ELSE 'Poor'
        END
    WHEN 'TTFB' THEN 
        CASE 
            # TTFB is converted to seconds above, so compare in seconds
            WHEN metric_value <= 0.5 THEN 'Good'
            WHEN metric_value <= 1.5 THEN 'Needs Improvement'
            ELSE 'Poor'
        END
    WHEN 'FCP' THEN 
        CASE 
            WHEN metric_value <= 2.000 THEN 'Good'
            WHEN metric_value <= 4.000 THEN 'Needs Improvement'
            ELSE 'Poor'
        END
  END AS metric_status
  # additions 1 END

FROM
  (
    SELECT
      ga_session_id,
      ARRAY_AGG(custom_event) AS events
    FROM
      (
        SELECT
          ga_session_id,
          STRUCT(
            country,
            device_category,
            device_os,
            traffic_medium,
            traffic_name,
            traffic_source,
            page_path,
            debug_target,
            event_timestamp,
            event_name,
            metric_id,
            # Tony's modification to support TTFB and FCP
            IF(event_name = 'LCP' OR event_name = 'TTFB' OR event_name = 'FCP', metric_value / 1000, metric_value) AS metric_value,
            user_pseudo_id,
            session_engaged,
            session_revenue,

            # additions 2 START
            page_location,
            page_type,
            continent,
            region,
            device_browser,
            effective_connection_type
            # additions 2 END

            ) AS custom_event
        FROM
          (
            SELECT
              (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')
                AS ga_session_id,
              (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'metric_id')
                AS metric_id,
              ANY_VALUE(device.category) AS device_category,
              ANY_VALUE(device.operating_system) AS device_os,
              ANY_VALUE(traffic_source.medium) AS traffic_medium,
              ANY_VALUE(traffic_source.name) AS traffic_name,
              ANY_VALUE(traffic_source.source) AS traffic_source,
              ANY_VALUE(
                REGEXP_SUBSTR(
                  (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_location'),
                  r'^[^?]+')) AS page_path,
              ANY_VALUE(
                (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'debug_target'))
                AS debug_target,
              ANY_VALUE(user_pseudo_id) AS user_pseudo_id,
              ANY_VALUE(geo.country) AS country,
              ANY_VALUE(event_name) AS event_name,
              SUM(ecommerce.purchase_revenue) AS session_revenue,
              MAX(
                (
                  SELECT
                    COALESCE(
                      value.double_value, value.int_value, CAST(value.string_value AS NUMERIC))
                  FROM UNNEST(event_params)
                  WHERE key = 'session_engaged'
                )) AS session_engaged,
              TIMESTAMP_MICROS(MAX(event_timestamp)) AS event_timestamp,
              MAX(
                (
                  SELECT COALESCE(value.double_value, value.int_value)
                  FROM UNNEST(event_params)
                  WHERE key = 'metric_value'
                )) AS metric_value,

                # additions 3 START
                ANY_VALUE((SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_location')) AS page_location,
                ANY_VALUE((SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_type')) AS page_type,
                ANY_VALUE(geo.continent) AS continent,
                ANY_VALUE(geo.region) AS region,
                ANY_VALUE(device.web_info.browser) AS device_browser,
                ANY_VALUE((SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'effective_connection_type')) AS effective_connection_type
                # additions 3 END

            FROM
              # Replace source table name
              `your-project.analytics_123456789.events_*`
            WHERE
              # modification to support TTFB and FCP
              event_name IN ('LCP', 'FID', 'CLS', 'TTFB', 'FCP', 'first_visit', 'purchase')
            GROUP BY
              1, 2
          )
      )
    WHERE
      ga_session_id IS NOT NULL
    GROUP BY ga_session_id
  )
CROSS JOIN UNNEST(events) AS evt
WHERE evt.event_name NOT IN ('first_visit', 'purchase');

After pasting, run the query for the first time. If it completes successfully, a result table named `web_vitals_summary` will be created under your analytics_ dataset. If it fails with an error, check whether you replaced `your-project.analytics_123456789` with your own project and dataset names.

Running the script in BigQuery
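The `metric_status` CASE expression in the query can be sanity-checked with an equivalent Python function. The thresholds mirror the query's classification logic, with LCP/TTFB/FCP expressed in seconds to match the `/1000` conversion the query applies (the TTFB cut-offs below are the 500/1500 ms values in seconds; the FCP cut-offs are this post's choice, not the official web.dev ones):

```python
def metric_status(metric_name: str, value: float) -> str:
    """Classify a Core Web Vitals value as Good / Needs Improvement / Poor.

    LCP, TTFB and FCP are expected in seconds (after the query's /1000
    conversion); FID in milliseconds; CLS is unitless.
    """
    # (good_threshold, needs_improvement_threshold) per metric
    thresholds = {
        "CLS": (0.1, 0.25),
        "LCP": (2.5, 4.0),
        "FID": (100, 300),
        "TTFB": (0.5, 1.5),   # 500/1500 ms expressed in seconds
        "FCP": (2.0, 4.0),    # thresholds chosen in this post's SQL
    }
    good, needs_improvement = thresholds[metric_name]
    if value <= good:
        return "Good"
    if value <= needs_improvement:
        return "Needs Improvement"
    return "Poor"

print(metric_status("LCP", 2.1))  # prints "Good"
print(metric_status("CLS", 0.3))  # prints "Poor"
```

Keeping a copy of the thresholds outside the SQL makes it easy to spot unit mismatches (seconds vs. milliseconds) before they silently mark every pageview as Good.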

5. Add a scheduled script (optional)

You can save this SQL (click Save) for later re-runs. You could also set up a scheduled query to run it, but keep in mind that BigQuery imposes a monthly quota per account; a scheduled job may exceed the free allowance (and overages are billed), so early on I recommend simply running it manually as needed.

6. Connect Google Data Studio and create an automated report

After running the script you will see the result table named `web_vitals_summary`. In its Preview, click Export in the top-right corner and choose Explore with Data Studio; all the fields and dimensions in the result table, along with their data, are imported into Data Studio, and you can start building your report. Whenever you re-run the script in BigQuery, the Data Studio report updates automatically — convenient and efficient, isn't it?

I have built a Data Studio report template that fits this script, but some readers reported that it fails to connect to their own result data (the report errors out); I will share the template once that issue is resolved. If you are not yet familiar with building Data Studio reports, see "How to Create a Core Web Vitals Performance Report in Data Studio".