-
Notifications
You must be signed in to change notification settings - Fork 38
fix: Fixed dispatching of URL for worker in remote content mode #209
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
jirihnidek
commented
Feb 20, 2024
- When worker was in remote content mode, then yggdrasil still expects raw data as valid JSON document. URL like: http://example.com/ is not valid JSON document. It could be e.g. "http://example.com/"
- Not sure if this is the best solution, but it solves the issue described in issue yggdrasil does not dispatch, when worker is remote content mode #208
3ce7c59
to
33ab195
Compare
internal/work/dispatcher.go
Outdated
} | ||
if config.DefaultConfig.DataHost != "" { | ||
URL.Host = config.DefaultConfig.DataHost | ||
} | ||
|
||
// Try to get content from given URL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is superfluous. The line below it clearly describes itself in its code language without the need for additional documentation. This might seem pedantic, but unnecessary comments contribute to the overall entropy of a codebase. In the future, if this section of code were to be refactored, the author would be expected to ensure that the comments are still accurate to the code they are commenting.
internal/work/dispatcher.go
Outdated
URL, err := url.Parse(string(data.Content)) | ||
// The type of data.Content is raw string, but it has to be JSON string fragment | ||
// containing URL. It means that it has to be something like '"http://example.com"' | ||
// and not only 'http://example.com' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is going to break existing message dispatch services that send a URL that isn't encapsulated in a JSON string fragment. I think we might have to somehow determine whether it's a raw URL or a URL encapsulated in a JSON string fragment. Maybe something like:
URL, err := url.Parse(string(data.Content))
if err != nil {
var urlStr string
err := json.Unmarshal(data.Content, &urlStr)
if err != nil {
return fmt.Errorf()
}
URL, err := url.Parse(urlStr)
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If URL is not encapsulated in a JSON string fragment or it isn't some JSON document, then it raises error here:
[yggd] 2024/03/04 16:26:25 /home/jhnidek/github/redhat-insights/yggdrasil/internal/transport/mqtt.go:72: cannot receive data message: cannot unmarshal data message: unexpected end of JSON input
So... I do not have any good idea how to be compatible with old services.
If we will have to break old services, what about requesting sending URL in more complex JSON document. It would be possible to specify method, HTTP headers, body, etc. Something like this:
{
"url": "http://example.com/path/to/app",
"method": "GET",
"x-my-header": "foo"
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that idea; it streamlines all the assumptions that message content is raw JSON. But now I'm not so sure about this entirely; message content was meant to be opaque to yggdrasil. Now, with it being json.RawMessage
, we're assuming message content is some JSON-serializable format. What about workers that want to send/receive BSON, or some other binary format? They'll be required to encode their content into something like base64 first. I need to step back and think about how yggdrasil 0.2 is handling this.
Plus, I'm not sure how we're going to handle this protocol break with existing services yet. Will they be forced to maintain logic that encodes the message content in JSON if the rhc version is greater than or equal to 0.3?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just looked at the yggdrasil-0.2
branch and it defines the yggdrasil.Data.Content
field as json.RawMessage
. Additionally, the code that handles the URL unmarshals the message content into a string:
var urlString string
if err := json.Unmarshal(data.Content, &urlString); err != nil {
log.Errorf("cannot unmarshal message content: %v", err)
return
}
URL, err := url.Parse(urlString)
if err != nil {
log.Errorf("cannot parse message content as URL: %v", err)
return
}
I think this means that we are not, in fact, breaking existing services after all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In experimenting with this PR, I don't think the requirement of having the
URL encapsulated in a string is required. Unmarshalling the content into a
string, as is done here in line 213, succeeds so long as the data can be
interpolated into a string, since a URL string can be, it succeeds. (I hope this
makes sense.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm talking myself in circles. I'll leave the comments above for posterity, but they're not accurate, nor factual. The code proposed by this PR is correct. The content must first be unmarshalled into a string. This is because all incoming content must be a JSON-serializable data type. In the case of these "remote content" workers, it is correct to require that the content be a URL string. It is an error to directly parse the content as a URL, since the content is typed as raw JSON. Existing services are not broken by this change either. They correctly set the value of the "content" field to a string containing a URL. It's the responsibility of the client to correctly unmarshal the content when handling these "remote content" workers.
I find it helpful to think of the json.RawMessage
data type as "everything after the colon". Consider the following Go (play.go.dev link):
type Person struct {
Name string `json:"name"`
RawName json.RawMessage `json:"raw_name"`
}
func main() {
jsonDocument := []byte(`{"name":"Testy McTestface","raw_name":"Testy McTestface II"}`)
var person Person
if err := json.Unmarshal(jsonDocument, &person); err != nil {
fmt.Printf("cannot unmarshal JSON: %v", err)
}
fmt.Printf("%#v", person)
}
The "Name" field gets its value set to the string "Testy McTestface". The "RawName" field gets its value set to a byte slice that includes the leading and trailing quotations. This is evident in the output:
main.Person{Name:"Testy McTestface", RawName:json.RawMessage{0x22, 0x54, 0x65, 0x73, 0x74, 0x79, 0x20, 0x4d, 0x63, 0x54, 0x65, 0x73, 0x74, 0x66, 0x61, 0x63, 0x65, 0x20, 0x49, 0x49, 0x22}}
The byte 0x22
is "
.
Tying this all back to the yggdrasil.Data.Content
field. Because it is typed as a json.RawMessage
, when unmarshalling it from a JSON document into a yggdrasil.Data
value, the "Content" field gets its value set to "everything after the colon" (including the quotation marks). So in order to properly parse it as a URL, it must first be unmarshalled from its JSON string format into a Go string.
This is as much an explanation for myself as it is to anyone else who might read this PR later.
Now, this comment is not factually correct either. If we must comment this at all, I think it needs to read something along the lines of:
// Because the content field is typed as json.RawMessage, it must first be unmarshalled into a Go string before parsing as a URL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@subpop Hi, I updated the comment. Is there anything else that should be fixed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I think this is good.
33ab195
to
4b12a7e
Compare
* When worker was in remote content mode, then yggdrasil still expects raw data as valid JSON document. URL like: http://example.com/ is not valid JSON document. It could be e.g. "http://example.com/" * Solves issue #208
4b12a7e
to
1fb9fd7
Compare