
Data is not correctly uploaded in the database? #841

Open
JulienPeloton opened this issue Apr 16, 2024 · 3 comments
Labels: bug, hbase

Comments


JulienPeloton commented Apr 16, 2024

Take https://fink-portal.org/ZTF24aahtdhb. The latest AVRO alert packet (20240416) gives

[Image: light curve from the latest AVRO alert packet (20240416)]

Note that here, noisy data and valid data have the same markers. The portal, at the same period, gives:

[Screenshot from 2024-04-16 14-39-24: light curve shown by the portal over the same period]

Possible scenarios:

  1. HBase jobs have failed
  2. The code that populates the HBase table for upper limits is incorrect (recent change: Add more columns for uppervalid alerts #785)
@JulienPeloton added the bug and hbase labels on Apr 16, 2024

JulienPeloton commented May 16, 2024

Some more insight: the script (fink-broker/bin/index_archival.py) looks at the most recent measurement in the history and checks whether its status is upper or uppervalid. If the status is valid, the whole history is skipped.

In an ideal world, if the latest point in the history is valid, there is indeed no need to push data from the history: it was already pushed when the previous valid alert was received. But this assumes we did not miss that previous valid alert! Missing an alert can happen when Fink is down or HBase crashes. This does not happen frequently, but it does happen. In that case (e.g. when receiving the second valid alert), we simply start from the last valid alert and never push the data contained in the history...

How could we check that a valid alert in the history has actually been processed and pushed by Fink, without blindly pushing all the data as we were doing before?
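One possible direction, sketched in plain Python (`find_missing_jds` is a hypothetical helper, not an existing fink-broker function): compare the jd values claimed by an alert's prv_candidates against the jd values actually present in the raw archive for that object, and flag the ones Fink never ingested.

```python
def find_missing_jds(history_jds, archived_jds, tol=1e-6):
    """Return jds present in the alert history but absent from the archive.

    history_jds: jd values found in prv_candidates
    archived_jds: jd values of alerts actually present in the raw archive
    tol: tolerance for floating-point jd comparison
    """
    return [jd for jd in history_jds
            if not any(abs(jd - a) <= tol for a in archived_jds)]

# Mimicking the ZTF24aahtdhb case: the history claims a valid point at
# jd=2460396.8996991, but the raw archive holds only one other alert.
history = [2460390.75, 2460396.8996991]
archive = [2460390.75]
print(find_missing_jds(history, archive))  # [2460396.8996991]
```

Any missing jd would then be a candidate for a targeted re-push, rather than re-pushing the entire history blindly.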


JulienPeloton commented May 16, 2024

For that specific night and that specific object (2024/03/27), though, something fishy happened. The history says that the previous alert was valid, and that it arrived on the same night:

```python
import pyspark.sql.functions as F
from fink_utils.spark.utils import check_status_last_prv_candidates

df = spark.read.format('parquet').load('archive/science/year=2024/month=03/day=27')
df = df.filter(df['objectId'] == 'ZTF24aahtdhb')

kinds = ['valid', 'upper', 'uppervalid']
for kind in kinds:
    df = check_status_last_prv_candidates(df, status=kind)

df.select(kinds).show()
# +-----+-----+----------+
# |valid|upper|uppervalid|
# +-----+-----+----------+
# | true|false|     false|
# +-----+-----+----------+

df.select(F.element_at('prv_candidates.jd', -1)).show()
# +---------------------------------+
# |element_at(prv_candidates.jd, -1)|
# +---------------------------------+
# |                  2460396.8996991|
# +---------------------------------+
```

But looking at the raw data, no sign of it:

```python
df_raw = spark.read.format('parquet').load('archive/raw/year=2024/month=03/day=27')
df_raw = df_raw.filter(df_raw['objectId'] == 'ZTF24aahtdhb')
df_raw.count()
# Out[35]: 1
```

I inspected streaming & database logs, but there is no sign of problems...

JulienPeloton commented May 16, 2024

And another generic problem: we check only the last measurement, and push data if its status is upper or uppervalid. But we handle only one status at a time. So if the last measurement is upper while the history contains earlier uppervalid points, those uppervalid points will not be pushed... 😭 And vice versa for uppervalid.
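A fix could select every upper/uppervalid point from the history, instead of only the points matching the status of the last measurement. A minimal sketch in plain Python (`points_to_push` and the `(jd, status)` pair representation are hypothetical, not the actual fink-broker data model):

```python
# Statuses whose history points should always be pushed to HBase.
STATUSES_TO_PUSH = ("upper", "uppervalid")

def points_to_push(history):
    """history: list of (jd, status) pairs from prv_candidates.

    Return all upper/uppervalid points, regardless of the status
    of the last measurement.
    """
    return [(jd, status) for jd, status in history
            if status in STATUSES_TO_PUSH]

# The last measurement is 'upper', yet the earlier 'uppervalid'
# point is still selected instead of being silently dropped.
history = [(2460390.1, "uppervalid"), (2460391.2, "valid"), (2460392.3, "upper")]
print(points_to_push(history))
# [(2460390.1, 'uppervalid'), (2460392.3, 'upper')]
```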

@JulienPeloton JulienPeloton transferred this issue from astrolabsoftware/fink-science-portal May 16, 2024