sara_b
Level 4

App Portal 2016 with SCCM delivery: failures and timers

Using App Portal 2016 with SCCM delivering the applications, we are struggling to minimize the failure rate. One of the things we noticed is that, by default, App Portal removes the target from the collection when a request is marked failed. Unfortunately for us, that also meant that we had no audit trail to find out what was causing the failures.
 
To mitigate this, we unchecked that option so that the target stays in the collection and provides additional data. This is proving a challenge in other ways. Because the target now stays in the collection, App Portal can time out and fail the installation, yet the software still installs successfully after the timeout. We lengthened the "Automatically fail software requests with no status after" timer from 48 hours to 96 hours. The problem with doing this is that our failure messages generate trouble tickets for our help desk. Now, if there's a legitimate reason the install is failing, the user has to wait 4 days for that ticket to be generated automatically, because the install keeps retrying.
 
Are there any tricks I'm missing here? What are your timeouts set to? And looking at the documentation for App Portal 2016, what is the "Monitor threshold for failed deployments to prevent false flags" timer for? The documentation doesn't list a value for it, nor does it define exactly what it does.
CharlesW
Flexera

Sara,
You mentioned that when the device was removed from the collection on a failure, you were losing your audit trail. Can you elaborate on how you typically diagnose failures? I'm wondering if there is something you could do from within a web service command or action in a failure scenario, so that you would not lose data before the device is removed from the collection. For example, is there something we could query in SCCM and then write to an App Portal log file?

The "Monitor threshold for failed deployments to prevent false flags" timer tells App Portal how long to keep checking status after an installation is initially reported as a failure. By default, this is 12 hours. This gives SCCM enough time to try the install again after a failure (as described in your post). I don't think this will solve your original problem, because after this period has elapsed the device will still be removed from the collection, resulting in the same loss of your audit trail.

Thanks
Charles

Charles,

You mention that by default the "auto fail" timer is set to 24 hours, but the documentation actually shows it as 168 hours?

The technicians tell me that the logs they get don't tell them WHY the installation failed unless we're leaving it in the collection. Also, this allows for reimages of machines to automatically install software already assigned to the target. 

jdempsey,

Reviewing the data is the reason we extended the auto fail timer in the first place from 48 to 96 hours. 

I think part of the problem we (my company) are having is potentially a communication problem. A user brought me an example. They requested software on 6/12. The software was installed that afternoon and was functional (it didn't error out or anything). On 6/17, they received a failure notice. When I initially looked at App Portal, it showed a failure for both machine policy and installation. I retried machine policy and it succeeded, but the installation is still listed as failed.

Question about the last timer - if no value is set - will it continue to monitor indefinitely? 


Leaving that last timer with no value will not cause it to monitor indefinitely.  It has a default value when no value is supplied.  I don't recall for sure, but I think it might be something like 2 hours.  Charles, can you confirm what the default value is for the "false flags" timer when no value is specified?

Regarding the "incorrect status", your best bet when you encounter such an issue is to open a case with our Support team and provide the App Portal logs, the App Portal Web Service logs (from the SCCM server), and the SCCM logs (both server and client).  Support can then help track down what's happening and determine whether timers need to be adjusted or whether there is some other issue.  In the example you gave here, it could be that the deployment was initially successful, and that the installation was later retried (because the device was still in the collection or perhaps was re-added to it) and failed because the software was already installed.  In that case, the most recent status from SCCM would indicate a failure, so App Portal would reflect that failure.  If that's what happened, it might be addressed by tweaking some of the timers, but we wouldn't know without further investigation.

BTW, what the technicians are likely talking about is that in-console SCCM dashboards and reporting will not show what happened with the deployment once the device is removed from the collection because the reports query status messages for current targets of the deployment based on the associated collection membership; however, the activity related to that client should all be in the logs.  You can start with the App Portal server logs.  They will tell you if we attempted to add the device to the collection or not.  Then you can look at the App Portal web service logs on the SCCM server.  They will tell you if the device was actually successfully added to the collection or not.  If it was, then the SCCM server logs should show activity related to that device and that deployment.  Assuming the policy was created for that client, you can then look at the SCCM client logs on that device to see what happened with the deployment.  If there is no activity related to that deployment policy, then the client never retrieved the policy.  If there is activity related to that deployment, then you'll see whether it was successful or if it failed and why.  None of this requires the device to remain in the collection.
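The order of checks described above can be written down as a small decision helper. This is only an illustrative sketch in Python: the function name `triage` and its arguments are hypothetical, nothing here reads real logs, and each argument simply records what a person found when looking at the corresponding log.

```python
from typing import Optional


def triage(add_attempted: Optional[bool] = None,
           added_to_collection: Optional[bool] = None,
           server_activity: Optional[bool] = None,
           client_activity: Optional[bool] = None) -> str:
    """Return the next log to check, or a conclusion, given what is known so far.

    Each argument stays None until the corresponding log has been checked.
    All names are hypothetical; this only encodes the order of the checks.
    """
    if add_attempted is None:
        return "Check the App Portal server logs: was an add to the collection attempted?"
    if not add_attempted:
        return "App Portal never attempted to add the device to the collection."
    if added_to_collection is None:
        return "Check the App Portal web service logs on the SCCM server: was the device added?"
    if not added_to_collection:
        return "The device was never successfully added to the collection."
    if server_activity is None:
        return "Check the SCCM server logs for activity on this device and deployment."
    if not server_activity:
        return "The SCCM server logs show no activity for this device and deployment."
    if client_activity is None:
        return "Check the SCCM client logs on the device for this deployment."
    if not client_activity:
        return "The client never retrieved the policy."
    return "The client logs show whether the deployment succeeded or failed, and why."


# Example: the device was added and the server shows activity,
# but the client logs contain nothing for this deployment.
print(triage(True, True, True, False))
```

None of the branches depends on the device still being a member of the collection.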

Anything expressed here is my own view and not necessarily that of my employer, Flexera. If my reply answers a question you have raised, please click "ACCEPT AS SOLUTION".

The "FailMonitorThreshold" value defaults to 12 hours if the value is not explicitly set. 

jdempsey
Moderator

Based on what you describe, I don't think you really want to lengthen the "Automatically fail software requests with no status after" timer.  When the client evaluates policy and acknowledges that it has something to do, we get a status back of "Policy Received".  If we never get that status back within the period set for this timer (starting from when the device is dropped into the collection), that's when we mark it as failed.  I generally set this timer at 24 hours because if the user has placed a request for the software, their machine is generally online and checking for policy within 24 hours.  Keep in mind that if there is an approval process in the middle, that may not always be the case (e.g. user requests software on a Friday afternoon and shuts down for the weekend before the request gets approved later that evening).  If this particular timer is causing problems, you may need to review a decent cross-section of requests to see what "typical" timing looks like.

There is another timer for "no change in status".  This timer is used to track the time between when we get an initial status from SCCM and when that status changes to something else.  For example, you may get a status of "Waiting for content".  If that status doesn't change to something else within the period set for this timer, we will mark it as failed (even though the client may eventually receive the content and successfully install).  I believe the default value for this timer is 168 hours (7 days), which I normally leave as-is.  7 days will typically account for situations like long weekends or large downloads, which could cause the status not to change for an extended period of time.  If you need to, you could set this value differently on individual catalog items that may take more or less time than is typical for others.

Finally, there is a timer for "Monitor threshold for failed deployments to prevent false flags".  This timer was added specifically for the scenario you're describing.  The idea is that we would normally stop monitoring status for a request after receiving a failure notification from SCCM.  However, if that request is retried and later succeeds, we wouldn't know because we stopped monitoring and marked it as failed.  By using this timer, you can specify a period of time that App Portal will wait after receiving a failed status before triggering the On Fail Install/Uninstall event and stopping the monitoring of the request.  Again, if you have certain "problem" packages, you can set this timer differently on a per catalog item basis.  I normally don't configure this timer, but know of a few customers that have used it with success.
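As a rough way to see how the three timers interact, here is a minimal Python sketch. It is a model of the behaviour described in this thread, not App Portal's actual code: the function name, the status strings, and the parameter names are all made up for illustration, and the default values are simply the figures quoted in this thread (24 hours for "no status", 168 hours for "no change in status", 12 hours for the "false flags" threshold).

```python
from datetime import datetime, timedelta
from typing import Optional


def evaluate_request(now: datetime,
                     added_at: datetime,
                     status: Optional[str] = None,
                     status_since: Optional[datetime] = None,
                     no_status_hours: int = 24,
                     no_change_hours: int = 168,
                     fail_monitor_hours: int = 12) -> str:
    """Return 'monitoring', 'succeeded', or 'failed' for one request.

    added_at     -- when the device was dropped into the collection
    status       -- most recent status reported by SCCM, or None if none yet
    status_since -- when that status was first seen
    """
    if status is None:
        # "No status" timer: nothing (not even "Policy Received") has come back.
        if now - added_at >= timedelta(hours=no_status_hours):
            return "failed"
        return "monitoring"
    if status == "success":
        return "succeeded"
    if status == "failure":
        # "False flags" threshold: keep watching in case a retry succeeds.
        if now - status_since >= timedelta(hours=fail_monitor_hours):
            return "failed"
        return "monitoring"
    # Any other status (e.g. "Waiting for content"): "no change in status" timer.
    if now - status_since >= timedelta(hours=no_change_hours):
        return "failed"
    return "monitoring"


t0 = datetime(2024, 1, 1, 9, 0)  # arbitrary sample timestamp
# A failure reported 10 hours ago is still inside the 12-hour window.
print(evaluate_request(t0 + timedelta(hours=30), t0, "failure", t0 + timedelta(hours=20)))
```

In this model, lengthening `no_status_hours` only delays the failure for requests that never report anything, which is why it also delays the help desk ticket for genuinely broken installs.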


On the "audit trail" topic, you should be able to see a lot of what is going on by looking at the App Portal request log for the request, the App Portal Web Service logs on the SCCM server, and the SCCM client logs on the device.  It would be rare that you couldn't diagnose the failure reason using a combination of those logs.


@jdempsey @CharlesW We are facing the same issue: App Portal marks the request as failed, but the software has been installed on the machine.

"Automatically fail software requests with no change in status after" timer is set to 168 hours.

We would like to know which status App Portal looks for before it marks the request as failed, and how App Portal determines whether an application is installed.


We query deployment status from the SCCM database.  For packages, this uses status messages.  For applications, this uses state messages.  Since state messages indicate the current state, you will only see the current state and no history for how it got to that point.  For history, you'd have to look at the SCCM logs on the client.  In contrast, status messages show you what the status was any time status changes, so you can see the full history of the deployment simply by querying the status messages in the database.
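The difference between the two message types can be illustrated with a tiny Python sketch. The event list below is made-up sample data (a success followed days later by a failed retry), and the helper names are hypothetical; it models only the idea of "full history" versus "current state", not how SCCM stores either.

```python
from datetime import datetime

# Made-up sample events for one device and one deployment.
events = [
    (datetime(2024, 6, 12, 15, 0), "success"),
    (datetime(2024, 6, 17, 9, 0), "failure"),  # e.g. a later retry that failed
]


def status_history(evts):
    """Status-message style view: every change, oldest first."""
    return sorted(evts, key=lambda e: e[0])


def current_state(evts):
    """State-message style view: only the most recent value, no history."""
    if not evts:
        return None
    return max(evts, key=lambda e: e[0])[1]


print([status for _, status in status_history(events)])
print(current_state(events))
```

With only the state-style view, the earlier success is invisible and the request looks like a plain failure, which is when the client logs become necessary.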

I believe these queries should be very similar to the queries we are using within the product code...

--Step 1:
--Run the SQL query below in the App Broker DB. Replace the values in the SET statements accordingly. This will return the SCCMAdvertisementID.

USE [AppPortal]

BEGIN
DECLARE @appPortalRequestID AS int
DECLARE @machineName AS nvarchar(15)
DECLARE @advertID AS nvarchar(255)

SET @appPortalRequestID = 7410
SET @machineName = 'SCCM-2012'

SELECT @advertID = sta.AdvertID
FROM WD_SiteToAdvert sta
INNER JOIN WD_WebPackages wp ON wp.PackageID = sta.PackageID
INNER JOIN wd_packagerequests pr ON pr.packageid_fk = sta.packageid and pr.requesttype = sta.[Type]
WHERE pr.RequestID = @appPortalRequestID and sta.[Type] = 0

SELECT @advertID AS SCCMAdvertisementID

END
GO

--Step 2: (if using Applications) 
--Run the SQL below in the SCCM DB. Replace the values in the SET statements accordingly. Use the SCCMAdvertisementID returned above for the @advertID variable.
--The Status columns will give you high level status, and the EnforcementState value can be mapped to a description in the App Broker admin UI.

--NOTE: This only works for Applications 

USE [CM_FX0]

BEGIN
DECLARE @machineName AS nvarchar(15)
DECLARE @advertID AS nvarchar(255)

SET @machineName = 'SCCM-2012'
SET @advertID = '16779305'

SELECT MachineName, AssignmentID, StartTime, CollectionID, CollectionName, StatusType,
       CASE StatusType WHEN 1 THEN 'successful' WHEN 5 THEN 'failed' ELSE 'pending' END AS StatusName,
       ComplianceState, EnforcementState, InstalledState
FROM vAppDeploymentAssetDetails
WHERE MachineName = @machineName
AND AssignmentID = @advertID

END
GO

--Step 3: (if using packages or task sequences)
--Run the SQL below in the SCCM DB. Replace the values in the SET statements accordingly. Use the SCCMAdvertisementID returned above for the @advertID variable.
--The MessageStateName should give you current overall status, while the MessageName should show the same detailed status as in the App Broker UI.

--NOTE: This only works for Packages and Task Sequences

USE [CM_FX0]

BEGIN
DECLARE @machineName AS nvarchar(15)
DECLARE @advertID AS nvarchar(255)
DECLARE @query AS nvarchar(MAX)
DECLARE @p3 AS xml

SET @machineName = 'SCCM-2012'
SET @advertID = 'FX020005'

--@machineName is passed to sp_executesql as a parameter (@Machine) rather than concatenated into the query text
SET @query = N'SELECT StatusMessages.MachineName, StatusMessages.SiteCode,
       StatusMessageAttributes.AttributeValue AS AdvertID, StatusMessages.Time,
       OfferStatusInfo.MessageName, OfferStatusInfo.MessageState, OfferStatusInfo.MessageStateName,
       OfferStatusInfo.MessageID & 0x0000FFFF AS LastStatusID, StatusMessages.RecordID
FROM StatusMessages
INNER JOIN OfferStatusInfo ON StatusMessages.ID = OfferStatusInfo.MessageID
INNER JOIN StatusMessageAttributes ON StatusMessages.RecordID = StatusMessageAttributes.RecordID
WHERE (StatusMessageAttributes.AttributeID = 401) AND (StatusMessages.Type = 258)
  AND StatusMessages.MachineName = @Machine
  AND EXISTS (SELECT * FROM (SELECT T.c.value(''.'',''nvarchar(255)'') AS Id FROM @Ids.nodes(''/L/I'') T(c)) Ids
              WHERE StatusMessageAttributes.AttributeValue = Ids.Id)
ORDER BY StatusMessages.Time DESC'
--Wrap the advertisement ID in the XML list format the query expects
SET @p3=convert(xml,N'<L><I>' + @advertID + N'</I></L>')

EXEC sp_executesql @query,N'@Ids xml, @Machine nvarchar(15)',@Ids=@p3,@Machine=@machineName

END
GO

 
