Conversation
| } | ||
|
|
||
| setupLog.Info("Attempting to connect to MongoDB...") | ||
| // TODO: Should we really fail, if connection to mongoDB fails? |
There was a problem hiding this comment.
Need suggestions here
Should we exit, if mongoDB connection fails? Given the main functionality here is to provision/delete the VM, we can just log, if mongoDB connection fails. And notifications won't be sent.
e880482 to
a140814
Compare
| os.Exit(1) | ||
| } | ||
|
|
||
| setupLog.Info("Attempting to connect to MongoDB...") |
There was a problem hiding this comment.
Any reason for connecting to mongoDB at controller start up?
I feel we can do the db connection during reconcile if we want to store some events on demand
| if err := scope.NotifyServiceCreationFailure(errorMsg); err != nil { | ||
| scope.Logger.Error(err, "failed to create failure notification event") | ||
| } |
There was a problem hiding this comment.
This might produce events during every reconcile right if VM is in error state?
There was a problem hiding this comment.
You might want to make it idempotent, if the event for given service id and name already exists, then do not need to update the event in DB
There was a problem hiding this comment.
I feel we can have in-memory cache for now (maybe later external-cache like redis), store the events there and then peridiocally maybe every 10-15 mins dump in DB. Without this you might end up making lot of DB calls I believe?
There was a problem hiding this comment.
This is something, that is supposed to be notified immediately, given the mail is sent based on events from DB, caching here wouldn't be right is what I think.
There was a problem hiding this comment.
May be, it could me more like a rate limiter, for a given VM instance, if the failure is seen more than x times in n seconds, we can send the event
There was a problem hiding this comment.
There was one more which @anshuman-agarwala said today that we can get the older VM state and current VM state, if older was error and the newer is also error then do not send. If its transitioned from anything else to error (meaning first time error), then notify.
Pls do check with him regarding this.
There was a problem hiding this comment.
@anshuman-agarwala , I checked the PVM_instance_spec. I checked the cloud docs as well. No spec clearly talks about previous health state. Only one field that could closely relate here is PVMInstanceHealth
I checked the cloud docs as well. It doesn't clearly specify anything about VM's older state.
There was a problem hiding this comment.
@mayuka-c . I have added a rate limiter. As you are already aware, it doesn't persist across restarts. But, for now, this should look ok.
There was a problem hiding this comment.
It's a controller-runtime concept, you can take a look at adding an admission webhook or a shared informer to the controller for this. We can discuss further offline.
a140814 to
0bc0116
Compare
Signed-off-by: Guna K Kambalimath <Guna.Kambalimath@ibm.com>
0bc0116 to
9dddd04
Compare
| var ( | ||
| notificationCache = make(map[string]time.Time) | ||
| cacheMutex sync.RWMutex | ||
| minIntervalMinutes = 30 |
There was a problem hiding this comment.
Since the cache is in-memory this can cause multiple emails in a short time if the controller goes into a restart loop, right?
There was a problem hiding this comment.
Yes, this can happen. In the long run, we should persist this using DB
| // collection exists | ||
| return true, nil | ||
| } | ||
| } No newline at end of file |
There was a problem hiding this comment.
.I ran gofmt, on a few files, and this got added, will remove it.
| logMessage := fmt.Sprintf("Service '%s' creation failed. Reason: %s", s.Service.Name, errorMessage) | ||
| event.SetLog(models.EventLogLevelERROR, logMessage) | ||
|
|
||
| dbCon, disconnect, err := connectDB(s.Logger) |
There was a problem hiding this comment.
It will be better if we can maintain a single long living connection instead of recreating the connection every time this method gets called. Hypothetically if a bunch of VMs fail at the same time then this will create a bunch of connections to the DB potentially causing slowdown/crash on the DB as well.
There was a problem hiding this comment.
The original changes was actually like that,
#75 (comment)
Mayuka added this comment that contradicts with yours. I think, maintaining a connection would be a better idea
There was a problem hiding this comment.
True, better to have a single persisted connection rather
There was a problem hiding this comment.
@GunaKKIBM We can go with single DB connection but please do make it a singleton (if the connection doesn't exist or if fails -> create new one or resuse the same connection)
| if err := scope.NotifyServiceCreationFailure(errorMsg); err != nil { | ||
| scope.Logger.Error(err, "failed to create failure notification event") | ||
| } |
There was a problem hiding this comment.
@mayuka-c . I have added a rate limiter. As you are already aware, it doesn't persist across restarts. But, for now, this should look ok.
|
|
||
| // collection exists | ||
| return true, nil | ||
| } |
There was a problem hiding this comment.
looks like, this got updated because of gofmt
| // collection exists | ||
| return true, nil | ||
| } | ||
| } No newline at end of file |
There was a problem hiding this comment.
.I ran gofmt, on a few files, and this got added, will remove it.
| logMessage := fmt.Sprintf("Service '%s' creation failed. Reason: %s", s.Service.Name, errorMessage) | ||
| event.SetLog(models.EventLogLevelERROR, logMessage) | ||
|
|
||
| dbCon, disconnect, err := connectDB(s.Logger) |
There was a problem hiding this comment.
The original changes was actually like that,
#75 (comment)
Mayuka added this comment that contradicts with yours. I think, maintaining a connection would be a better idea
| var ( | ||
| notificationCache = make(map[string]time.Time) | ||
| cacheMutex sync.RWMutex | ||
| minIntervalMinutes = 30 |
There was a problem hiding this comment.
Yes, this can happen. In the long run, we should persist this using DB
| } | ||
|
|
||
| // Notify both user and admin | ||
| event.SetNotifiyBoth() |
There was a problem hiding this comment.
Do we need to notify user as well? I think sending a email to user might not be needed. @mayuka-c what do you think
There was a problem hiding this comment.
Yeah I think not required for user, only admin should be fine.
This PR notifies the user and admin via email, if VM creation fails/VM is in error state.